This repository contains the numerical experiments from the paper by Ziping and me, "A Natural Extension To Online Algorithms For Hybrid RL With Limited Coverage", which you can find at https://arxiv.org/abs/2403.09701.
The simulations demonstrate that appending offline data to the experience replay buffer can encourage sufficient exploration of the portion of the state-action space that lacks good coverage. This yields a simple but effective extension of an online RL algorithm to the hybrid RL setting.
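The core idea can be sketched as pre-loading a replay buffer with offline transitions before online interaction begins. The following is a minimal illustrative sketch, not the repository's actual code; the buffer class, transition format, and all names are hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Minimal FIFO replay buffer (hypothetical sketch)."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)

    def add(self, transition):
        # transition: (state, action, reward, next_state)
        self.storage.append(transition)

    def sample(self, batch_size):
        # Uniformly sample a minibatch (without replacement).
        return random.sample(self.storage, min(batch_size, len(self.storage)))


# Hybrid extension: seed the buffer with offline transitions
# collected by a behavior policy (data here is illustrative).
offline_data = [("s0", "a0", 1.0, "s1"), ("s1", "a1", 0.0, "s2")]
buffer = ReplayBuffer(capacity=10_000)
for transition in offline_data:
    buffer.add(transition)

# Online transitions go into the same buffer, so every minibatch
# mixes offline coverage with freshly explored experience.
buffer.add(("s2", "a2", 0.5, "s3"))
batch = buffer.sample(3)
```

Because the offline partition is already represented in the buffer, the learner's value estimates there are grounded by data, freeing online interaction to concentrate on the poorly covered complement.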
That is, where the "offline partition" is the portion of the state-action space well-visited by the behavior policy and the "online partition" is its complement, we see that the hybrid RL algorithm visits the online partition more often than the online RL algorithm does, and vice versa for the offline partition. This holds in both a tabular forest management simulator and a Tetris simulator that we cast as a linear MDP.