realvnf / distributed-drl-coordination


Topics: python, reinforcement-learning, distributed-systems, service, service-management, orchestration, scaling, placement, scheduling, routing, networking, acktr, actor-critic, stable-baselines, tensorflow, online-algorithm, coordination


Distributed Online Service Coordination Using Deep Reinforcement Learning

Self-learning and self-adaptive service coordination using deep reinforcement learning (DRL). Service coordination includes scaling and placement of chained service components as well as scheduling and routing of flows/requests through the placed instances. We train our proposed DRL approach offline in a centralized fashion and then deploy a distributed DRL agent at each node in the network. This fully distributed DRL approach only requires local observation and control and significantly outperforms existing state-of-the-art solutions.
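To make the deployment idea above concrete, here is a purely illustrative Python sketch (not the repository's actual implementation): after centralized offline training, every node runs an identical copy of the trained policy and acts on nothing but its own local observation.

import numpy as np

class DistributedAgent:
    """One agent per network node; decisions use only the node's local view."""
    def __init__(self, node_id, policy):
        self.node_id = node_id
        self.policy = policy  # identical trained policy deployed to every node

    def act(self, local_observation):
        # No global state and no inter-node communication are required here.
        return self.policy(local_observation)

# Hypothetical stand-in for the trained DRL policy (placeholder decision rule).
policy = lambda obs: int(np.argmax(obs))

# Deploy one agent per node; all agents share the same offline-trained policy.
agents = {node: DistributedAgent(node, policy) for node in ["A", "B", "C"]}
action = agents["A"].act(np.array([0.2, 0.7, 0.1]))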

Citation

If you use this code, please cite our paper:

@inproceedings{schneider2021distributed,
	title={Distributed Online Service Coordination Using Deep Reinforcement Learning},
	author={Schneider, Stefan and Qarawlus, Haydar and Karl, Holger},
	booktitle={IEEE International Conference on Distributed Computing Systems (ICDCS)},
	year={2021},
	organization={IEEE},
	note={to appear}
}

Installation

This package requires stable_baselines to work. Before installing it, make sure the following system packages are installed:

# On Ubuntu
sudo apt-get update && sudo apt-get install cmake libopenmpi-dev python3-dev zlib1g-dev libgl1-mesa-glx libsm6 libxext6
# check your python3 version
python3 --version

If your Python version is neither 3.6 nor 3.7 (3.8+ does not support TensorFlow 1, which is currently required by stable_baselines), manually install the correct version as described here:

sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.7 python3.7-dev

The package can then be installed as follows, requiring Python 3.6 or 3.7:

# Create a python 3 virtual environment
python3 -m venv venv
# if python3 != 3.6 or 3.7, use the manually installed python3.7 instead (see above)

# Activate the virtual environment
source venv/bin/activate

# Update pip
pip install -U pip

# Install the DRL package and its requirements (-e is for dev install)
pip install -e .

On Windows 10+, you might need to install MPI separately: https://stackoverflow.com/a/54907810/2745116

Usage

The inputs available for the DRL agent are placed in the inputs folder.

The inputs folder is structured as follows:

.
├── config
│   ├── drl: contains the configuration YAML files for the DRL agent
│   └── simulator: contains the simulator configuration files defining the traffic patterns
├── networks: contains the network GraphML files
├── services: contains the configuration files for the defined services and their components
└── traces: contains a trace to dynamically change the traffic pattern during simulations
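
As a quick sanity check of these inputs, a network GraphML file can be inspected with networkx (a minimal sketch; networkx is assumed to be installed, and the path is one of the provided example networks):

import networkx as nx

# Load one of the provided network topologies.
network = nx.read_graphml("inputs/networks/abilene_1-5in-1eg/abilene-in1-rand-cap0-2.graphml")

# Print basic properties: node/link counts and a few node attributes.
print(f"Nodes: {network.number_of_nodes()}, Links: {network.number_of_edges()}")
for node, attrs in list(network.nodes(data=True))[:3]:
    print(node, attrs)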

After installation, you can check that the DRL agent is set up correctly as follows (ignore TensorFlow warnings):

$ spr-rl --help

Usage: spr-rl [OPTIONS] NETWORK AGENT_CONFIG SIMULATOR_CONFIG SERVICES TRAINING_DURATION

  SPR-RL DRL Scaling and Placement main executable

Options:
  -s, --seed INTEGER       Set the agent's seed
  -t, --test TEXT          Path to test timestamp and seed
  -a, --append_test        test after training
  -m, --model_path TEXT    path to a model zip file
  -ss, --sim-seed INTEGER  simulator seed
  -b, --best               Select the best agent
  --help                   Show this message and exit.

To train an ACKTR DRL agent for 100,000 steps and then test it immediately on the Abilene network, run the following example command from within this directory:

$ spr-rl inputs/networks/abilene_1-5in-1eg/abilene-in1-rand-cap0-2.graphml inputs/config/drl/acktr/acktr_default_4-env.yaml inputs/config/simulator/mean-10.yaml inputs/services/abc-start_delay0.yaml 100000 -a

The results and trained model are saved in the results directory, which is created automatically.
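
To reuse a trained agent outside the CLI (e.g., via the -m/--model_path option), the saved zip can also be loaded with stable_baselines directly. A minimal sketch, assuming the default ACKTR algorithm; the placeholder path must be replaced with the actual timestamped path under results:

from stable_baselines import ACKTR

# Placeholder: the real path under results/ contains a timestamp and seed.
model_path = "results/<run>/model.zip"

model = ACKTR.load(model_path)

# Given a local observation obs from the simulator environment, predict the
# next scaling/placement/routing action deterministically:
# action, _states = model.predict(obs, deterministic=True)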

Tensorboard

To visualize training progress, start TensorBoard:

tensorboard --logdir tb

Then go to http://localhost:6006

Parallelization

The training of the agent can be parallelized via the GNU Parallel tool. A helper script is provided in the utils folder, and the agent's inputs must be defined in the corresponding *.txt files inside that folder. To run the parallel script, execute the following command from the current directory:

$ ./utils/parallel.sh


distributed-drl-coordination's People

Contributors

qarawlus, stefanbschneider


distributed-drl-coordination's Issues

Why is 'flow_size_shape' so small, and is it practical?

Hi, thanks for your open-source code. I am trying to understand it and may cite your paper in my future work. I am wondering why you set 'flow_size_shape' so small in the deterministic setting: with the default 'flow_dr_mean' of 1, a flow's duration is only 1 time unit, while the link propagation delay or VNF processing delay is several time units, so flows are released very quickly after they are created.
Can I set 'flow_size_shape' much larger than the propagation delay? Also, what is 'run_duration' in your code and how should it be used? Why is its default 100? Thanks.

Question about upgrading d-drl-coordination to multi-agent DRL

Hi @stefanbschneider,
I recently finished migrating d-drl-coordination (SB3) to an RLlib version. After adding a curiosity module, I found that a similar success rate can be achieved even with only success/failure rewards.

I have some ideas that might be interesting. d-drl-coordination coordinates flows with different arrival times; if the DRL approach were upgraded to MARL, with, say, three to four agents processing flows in parallel, there might be a higher success rate in MMPP mode or in the deterministic real-world trace mode.

After searching nearly all of RLlib's documentation and community resources, I found very little information on how to turn a custom environment into a multi-agent environment. Could you provide some pointers on this? :)

I would be happy to share my project with you. However, I have been quite busy recently. Once I have time to upload the complete project, I will let you know as soon as possible. :)


Question about SPRPolicy

This program is great, with excellent success rates and speed. I also tried the acer and ppo2 algorithms: acer comes very close to the success rate of acktr, while ppo2 is relatively low. Recently, I was reading the source code, but I don't understand neural networks very well, so I don't quite understand the rewritten MLP policy.
class SPRPolicy(FeedForwardPolicy):
    """
    Custom policy. Exactly the same as MlpPolicy but with different NN configuration
    """

    def __init__(self, sess, ob_space, ac_space, n_env, n_steps, n_batch, reuse=False, **_kwargs):
        self.params: Params = _kwargs['params']
        pi_layers = self.params.agent_config['pi_nn']
        vf_layers = self.params.agent_config['vf_nn']
        activ_function_name = self.params.agent_config['nn_activ']
        activ_function = eval(activ_function_name)
        net_arch = [dict(vf=vf_layers, pi=pi_layers)]
        super(SPRPolicy, self).__init__(sess, ob_space, ac_space, n_env, n_steps, n_batch, reuse,
                                        net_arch=net_arch, act_fun=activ_function,
                                        feature_extraction="spr", **_kwargs)
I found that DeepCoord seems to use a similar neural network structure. Can you briefly describe why you rewrote the policy and what the advantages of doing so are? Looking forward to your reply :)
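
For context, net_arch is the stable_baselines mechanism that splits the network into separate policy (pi) and value (vf) branches, with the layer sizes above read from the agent's YAML config. Roughly the same structure can be obtained without subclassing by passing policy_kwargs (a sketch with illustrative layer sizes, not the repo's actual configuration):

from stable_baselines import ACKTR
from stable_baselines.common.policies import MlpPolicy

# Two hidden layers of 64 units each for the policy head (pi) and for the
# value head (vf); sizes are illustrative, not the values from the YAML files.
policy_kwargs = dict(net_arch=[dict(pi=[64, 64], vf=[64, 64])])

# model = ACKTR(MlpPolicy, env, policy_kwargs=policy_kwargs, verbose=1)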

Installation times out

Hi, when I run pip install -r requirements.txt, it always times out:

error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet git://github.com/RealVNF/common-utils /home/guohao/PycharmProjects/codes/venv/src/common-utils did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

How can I solve this problem? Thank you.

Originally posted by @BMDACMER in #2 (comment)
