Agent Learning Framework (ALF) is a reinforcement learning framework that emphasizes flexibility in writing complex model architectures. ALF is built on TensorFlow 2.0.
- A2C: OpenAI Baselines: ACKTR & A2C
- DDPG: Lillicrap et al. "Continuous control with deep reinforcement learning" arXiv:1509.02971
- PPO: Schulman et al. "Proximal Policy Optimization Algorithms" arXiv:1707.06347
- SAC: Haarnoja et al. "Soft Actor-Critic Algorithms and Applications" arXiv:1812.05905
- ICM: Pathak et al. "Curiosity-driven Exploration by Self-supervised Prediction" arXiv:1705.05363
- MERLIN: Wayne et al. "Unsupervised Predictive Memory in a Goal-Directed Agent" arXiv:1803.10760
You can run the following commands to install ALF:
git clone https://github.com/HorizonRobotics/alf
cd alf
git submodule update --init --recursive
cd tf_agents
pip install -e .
cd ..
pip install -e .
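After the install steps above, a quick check that the packages are visible to Python can save a failed training run later. The following is a minimal sketch, not part of ALF: the helper name `is_installed` is hypothetical, and the import names `alf` and `tf_agents` are assumed from the directories used in the commands above.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the import system can locate `package_name`.

    Only top-level package names are expected here; the package is
    located but not imported.
    """
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumed import names, taken from the install commands above.
    for name in ("alf", "tf_agents"):
        status = "found" if is_installed(name) else "MISSING"
        print(f"{name}: {status}")
```

If either package is reported as missing, re-running the corresponding `pip install -e .` step in the right directory is the first thing to try.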
All the examples below were trained on a single machine with an Intel(R) Core(TM) i9-7960X CPU @ 2.80GHz (32 CPUs) and one RTX 2080Ti GPU.
You can train the model for any of the examples using the following command:
python -m alf.bin.main --gin_file=GIN_FILE --root_dir=LOG_DIR
GIN_FILE is the gin configuration file. You can find sample gin configuration files for different tasks under the directory alf/examples. LOG_DIR is the directory where you want to store the training results.
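To see which sample configurations are available to pass as GIN_FILE, the directory can be listed from Python. This is a minimal sketch under two assumptions: the configuration files use the `.gin` extension, and the script is run from the root of the cloned repository so that `alf/examples` resolves. The helper name `list_gin_files` is hypothetical.

```python
from pathlib import Path
from typing import List


def list_gin_files(examples_dir: str = "alf/examples") -> List[str]:
    """Return the sorted names of *.gin files directly under `examples_dir`.

    Returns an empty list if the directory does not exist.
    """
    root = Path(examples_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.glob("*.gin"))


if __name__ == "__main__":
    for name in list_gin_files():
        print(name)
```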
During training, you can use TensorBoard to monitor training progress:
tensorboard --logdir=LOG_DIR
After training, you can visualize the trained model using the following command:
python -m alf.bin.main --play --root_dir=LOG_DIR
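When launching several experiments from a script, it can help to build the train and play commands shown above in one place. The sketch below only assembles the argument list; the flags `--gin_file`, `--root_dir`, and `--play` are the ones used in the commands above, while the function name `build_alf_command` and the example paths are hypothetical.

```python
from typing import List, Optional


def build_alf_command(root_dir: str,
                      gin_file: Optional[str] = None,
                      play: bool = False) -> List[str]:
    """Build the argument list for `python -m alf.bin.main`.

    Training needs a gin file; play mode only needs the log directory
    of a previous training run, mirroring the commands above.
    """
    cmd = ["python", "-m", "alf.bin.main"]
    if play:
        cmd.append("--play")
    else:
        if gin_file is None:
            raise ValueError("gin_file is required for training")
        cmd.append(f"--gin_file={gin_file}")
    cmd.append(f"--root_dir={root_dir}")
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(build_alf_command("/tmp/my_run", gin_file="my_task.gin"))
    print(build_alf_command("/tmp/my_run", play=True))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to actually start the run.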
- Cart pole. The training score took only 30 seconds to reach 200, using 8 environments.
- Atari games. The Python package atari-py must be installed for the Atari game environments. The evaluation score (obtained by taking the argmax of the policy) took 1.5 hours to reach 800 on Breakout, using 64 environments.
- Simple navigation with visual input. Follow the instructions at SocialRobot to install the environment.
- PR2 grasping state only. Follow the instructions at SocialRobot to install the environment.
- Humanoid. Learning to walk using the pybullet Humanoid environment. The Python package pybullet>=2.5.0 must be installed for the environment. The training score took 1 hour 40 minutes to reach 2k, using asynchronous training with 2 actors (192 environments).
- Super Mario. Playing Super Mario using only intrinsic reward. The Python package gym-retro>=0.7.0 is required for this experiment, and a suitable SuperMarioBros-Nes ROM must be obtained and imported (ROMs are not included in gym-retro). See this doc on how to import ROMs.
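The Atari entry above reports an evaluation score obtained by taking the argmax of the policy. For a discrete-action policy, that means picking the highest-scoring action at evaluation time instead of sampling from the action distribution as during training. The following is a minimal, framework-free sketch of the difference; it is not ALF's implementation, and the function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert unnormalized action scores into probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_action(logits: Sequence[float]) -> int:
    """Evaluation-style choice: index of the highest-scoring action."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_action(logits: Sequence[float], rng: random.Random) -> int:
    """Training-style choice: draw an action from the policy distribution."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [0.1, 2.0, -1.0]  # made-up scores for three actions
    print("greedy:", greedy_action(logits))
    print("sampled:", sample_action(logits, random.Random(0)))
```

Greedy selection removes the exploration noise from the reported score, which is why evaluation and training scores for the same policy can differ.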