
Comments (8)

stwerner97 commented on August 22, 2024

Hi @R0B1NNN1 ☺️

For mobile-env, is it possible to set the number of users per base station instead of having random users?

I am not sure if I understand your question. Do you want to set a maximum number of possible connections per base station? What do you mean by random users?

In this environment, we assume that each UE can connect to multiple BSs, right? Is it possible to change the settings to each UE can only connect 1 BS.

If I recall our implementation correctly, mobile-env doesn't support this use case out-of-the-box. A simple change to support such a setting could be for us to add some lines to the apply_action method in MComCore.

...
# establish connection if user equipment not connected but reachable
elif self.check_connectivity(bs, ue):
    self.connections[bs].add(ue)

    # <TODO: disconnect ue from all BS other than bs >
    ...

Maybe we could also support a more general action space that defines which connections are active in each step, instead of an action space that activates or deactivates a single connection per UE, i.e., a binary action space of dimension [UEs, BSs] instead of a multi-discrete action space of size [UEs] with values in [0, BS]. The multi-discrete action space (i.e., the current implementation) is then a special case of the suggested action space (we could provide it via some wrapper class).
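To illustrate the relationship between the two action spaces, here is a minimal sketch of the wrapper direction: mapping a multi-discrete action (one cell choice per UE, 0 = noop) into the suggested binary [UEs, BSs] matrix. The helper name multidiscrete_to_binary is hypothetical, not part of mobile-env:

```python
def multidiscrete_to_binary(actions, num_bs):
    """Convert per-UE cell selections (0 = noop, k = toggle BS k)
    into a binary connection-toggle matrix of shape [num_ues, num_bs]."""
    matrix = [[0] * num_bs for _ in actions]
    for ue, a in enumerate(actions):
        if a > 0:  # 0 means "do nothing" for this UE
            matrix[ue][a - 1] = 1
    return matrix

# 3 UEs, 2 BSs: UE 0 selects BS 1, UE 1 does nothing, UE 2 selects BS 2
print(multidiscrete_to_binary([1, 0, 2], num_bs=2))
# → [[1, 0], [0, 0], [0, 1]]
```

A wrapper built this way would let the current multi-discrete interface coexist with the more general binary one.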

Adopting such an action space could also enable custom handlers, where connecting a single UE to multiple base stations is an invalid action (I think this would cover @R0B1NNN1's suggestion). At the same time, this could be a larger change to mobile-env. What do you think @stefanbschneider? πŸ™†β€β™‚οΈ

from mobile-env.

stefanbschneider commented on August 22, 2024

Thanks for answering already @stwerner97 :) Here are my comments:


User positions and movement:

if we have 2 base stations and 3 users, the three users will be randomly placed among the BSs, am I correct? My question is: is it possible to set the number of users per base station? E.g., if we have 2 BSs, can I set 2 UEs per base station (4 users in total) so that each may only move within the range of its particular BS?

You can configure where users start and how they move yourself. You can give them a fixed position, e.g., close to a BS, or randomize it. You can have them standing still or moving around as you wish. Currently, the default is to move around randomly, following a random waypoint model: https://github.com/stefanbschneider/mobile-env/blob/main/mobile_env/core/movement.py
This is in line with most of the literature.

You can implement your own Movement child class if you want to have custom user movement.
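As an illustration of such a custom movement model, here is a minimal, self-contained sketch that keeps each UE within a fixed radius of a "home" BS (the user's original question). The class and method names (initial_position, move) only mimic the mobile-env Movement interface and are assumptions, not the verified API:

```python
import random

class TetheredMovement:
    """Keeps each UE within `radius` of its assigned home BS position.
    Illustrative sketch only; in practice you would subclass
    mobile_env.core.movement.Movement and adapt to its actual interface."""

    def __init__(self, home_positions, radius, seed=0):
        self.home = home_positions  # ue_id -> (x, y) of the assigned BS
        self.radius = radius
        self.rng = random.Random(seed)

    def initial_position(self, ue_id):
        # start the UE directly at its home BS
        return self.home[ue_id]

    def move(self, ue_id, position):
        # propose a small random step; reject it if it leaves the home radius
        hx, hy = self.home[ue_id]
        x, y = position
        nx = x + self.rng.uniform(-1, 1)
        ny = y + self.rng.uniform(-1, 1)
        if (nx - hx) ** 2 + (ny - hy) ** 2 <= self.radius ** 2:
            return (nx, ny)
        return position  # stay put instead of leaving the allowed area

movement = TetheredMovement({0: (0.0, 0.0)}, radius=5.0)
pos = movement.initial_position(0)
for _ in range(100):
    pos = movement.move(0, pos)
print(pos)  # always within 5.0 of (0.0, 0.0)
```

Assigning 2 UEs per BS would then just mean giving two UE ids the same home position.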


Connecting to a single cell rather than multiple / changed action space:

Like @stwerner97 said, mobile-env currently focuses on multi-cell selection, but could be adjusted for single-cell selection.

@stwerner97 I agree that your suggestion would allow for more general actions. I originally decided against actions that indicate which cells each agent is currently connected to because it would grow the action space by the number of cells.
But usually, we don't have that many cells + if we could have custom handlers, that would be great!

It's just that I don't have the time to implement anything like that. :S Do you? Otherwise, we could give some hints and @R0B1NNN1 implements what they need themselves?


Other questions:

when using distributed multi-agent control, we have to set how they share the information, right?

RLlib does let you decide whether you want to train a single shared policy/neural network or separate ones for each agent.
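To illustrate, the shared-vs-separate choice is expressed through RLlib's policy mapping function. The mapping functions below are plain Python; the commented config snippet is an assumption based on the Ray ~2.x multi-agent API, and exact names may differ by version:

```python
# Option A: all UE agents share a single policy (parameter sharing).
def shared_mapping(agent_id, *args, **kwargs):
    return "shared_policy"

# Option B: one separate policy (own network) per agent.
def separate_mapping(agent_id, *args, **kwargs):
    return f"policy_{agent_id}"

# In RLlib (~2.x) this plugs in roughly as:
# config = (
#     PPOConfig()
#     .multi_agent(
#         policies={"shared_policy"},  # or one entry per agent
#         policy_mapping_fn=shared_mapping,
#     )
# )

print(shared_mapping("ue_0"), shared_mapping("ue_1"))
print(separate_mapping("ue_0"), separate_mapping("ue_1"))
```

Note that neither option adds explicit communication between agents; each agent still acts only on its own observation.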

My question is: does the GIF show the learning progress of the agent, or does it show us testing and rendering the learned policy?

It's just the steps run above, i.e., during testing.


R0B1NNN1 commented on August 22, 2024

Hello, @stefanbschneider, thanks for replying.

Sorry for the confusion. What I mean is: if we have 2 base stations and 3 users, the three users will be randomly placed among the BSs, am I correct? My question is: is it possible to set the number of users per base station? E.g., if we have 2 BSs, can I set 2 UEs per base station (4 users in total) so that each may only move within the range of its particular BS?

Do you want to set a maximum number of possible connections per base station?

You mentioned this; may I also ask, is that possible?

Regarding my second question, I am not sure how the cellular network works.

A simple change to support such a setting could be for us to add some lines to the apply_action method in MComCore.

I do not really understand what you mean.

Also, I have another question: if I want to train mobile-env using multi-agent reinforcement learning, say with ray[rllib], how do I deal with the communication between agents? I saw you mentioned that > It supports both centralized, single-agent control and distributed, multi-agent control. When using distributed multi-agent control, we have to set how they share information, right? Does ray[rllib] support doing this, or do I have to write the algorithm myself?

Thanks for replying in advance, I apologise for having so many questions.


R0B1NNN1 commented on August 22, 2024

Also for this part:

from stable_baselines3 import PPO
from stable_baselines3.ppo import MlpPolicy

# create the custom env with the custom handler (obs space) from step 2
env = CustomEnv(config={"handler": CustomHandler})

# train PPO agent on environment. this takes a while
model = PPO(MlpPolicy, env, tensorboard_log='results_sb', verbose=1)
model.learn(total_timesteps=30000)

We are training the agent using the PPO algorithm.

For this part:

import matplotlib.pyplot as plt
from IPython import display

env = CustomEnv(config={"handler": CustomHandler}, render_mode="rgb_array")
obs, info = env.reset()
done = False

# run one episode with the trained model
while not done:
    action, _ = model.predict(obs)

    # perform step on simulation environment
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated

    # render environment as RGB
    plt.imshow(env.render())
    display.display(plt.gcf())
    display.clear_output(wait=True)

My question is: does the GIF show the learning progress of the agent, or does it show us testing and rendering the learned policy?


R0B1NNN1 commented on August 22, 2024

@stefanbschneider, @stwerner97:
Thanks for replying. Could you please give me some hints on how to connect to a single cell rather than multiple / change the action space? If I cannot figure it out, I think I can still work with connecting to multiple cells.

For RL lib:

Does RLlib let you customize the information communicated between agents? I did not really find any information about this.

For Multi-Agent RL with Ray RLlib in Colab

May I ask whether you have gotten the example you provided in the Colab (about Multi-Agent RL with Ray RLlib) to work? Also, I believe the Colab example uses ray[rllib] version 2.2.0; does that mean the example will work if I install this version?

Thanks a lot for the help.


stefanbschneider commented on August 22, 2024

Could you please give me some hints on how to connect to a single cell rather than multiple / change the action space?

As suggested by @stwerner97 above, the simplest approach is to clone and modify mobile-env as follows:
Inside the apply_action function, UEs are connected to new cells here: https://github.com/stefanbschneider/mobile-env/blob/main/mobile_env/core/base.py#L259

The main idea would be to first remove all existing connections before connecting to a new cell. That way, each UE is always connected to at most one cell at a time.

The resulting code would look like this:

# establish connection if user equipment not connected but reachable
elif self.check_connectivity(bs, ue):
    # NEW: remove all existing connections of this UE first
    available_bs = self.available_connections(ue)
    for bs_to_remove in available_bs:
        # discard() does not raise if the UE is not connected to this BS
        self.connections[bs_to_remove].discard(ue)

    # then connect to the new cell
    self.connections[bs].add(ue)

I hope this helps.
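As a sanity check, the same logic can be reproduced in a standalone toy class. This is not the real MComCore; the names connections and check_connectivity only mimic mobile-env, and the always-reachable connectivity check is a deliberate simplification:

```python
from collections import defaultdict

class ToyEnv:
    """Toy model of single-cell selection: connecting a UE to a BS first
    drops all of that UE's existing connections."""

    def __init__(self, num_bs):
        self.connections = defaultdict(set)  # bs_id -> set of ue_ids
        self.num_bs = num_bs

    def check_connectivity(self, bs, ue):
        return True  # toy: every BS is always reachable

    def connect_single(self, bs, ue):
        if self.check_connectivity(bs, ue):
            # drop every existing connection of this UE first
            for other in range(self.num_bs):
                self.connections[other].discard(ue)
            self.connections[bs].add(ue)

env = ToyEnv(num_bs=3)
env.connect_single(0, ue=7)
env.connect_single(2, ue=7)  # reconnect: the UE must leave BS 0
print(sorted(bs for bs, ues in env.connections.items() if 7 in ues))
# → [2]
```

The invariant to verify after any sequence of actions is that each UE appears in at most one BS's connection set.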


In terms of communication between agents in RLlib, I cannot help you, I'm afraid.

Also, I just fixed the Ray RLlib Colab to work with Ray 2.5.1: #38 (comment) :)


R0B1NNN1 commented on August 22, 2024

@stefanbschneider :

Hello, thanks for helping; everything works well. I successfully did centralized training with distributed execution as well as distributed training with distributed execution using mobile-env. If I read the code correctly, we currently assign an agent to each single user, right? May I ask the advantage of this setting and the motivation behind it? If I want to assign an agent per cell, will that work as well? Is it hard to change? And if I want to change it, I have to change the Handler first and then also change the step in mobile_env.base.step, am I correct?

Thanks for replying in advance.


stefanbschneider commented on August 22, 2024

Hello, thanks for helping, everything works well.

Great to hear! In that case, I'd close the issue if you don't object.


If I read the code correctly, we currently assign an agent to each single user, right? May I ask the advantage of this setting and the motivation behind it?

You are correct, in the multi-agent setting, we currently have one agent per user.

The reason is that it is easier to keep the observation and action spaces at a fixed size: we have a fixed number of cells that users can choose from in their actions (and that they see in their observations). This is represented by the fixed-size observations and actions.

In contrast, the number of users in the area and close to a cell changes over time as users move around. Still, the action and observation spaces have to stay fixed. This means we need to define an upper limit on the number of users per cell, or some other workaround, if we want to have an agent per cell.
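To illustrate that workaround, a per-cell agent could pad or truncate the variable number of per-user observations to a fixed max_users size so the observation shape stays constant. The helper name pad_observations is hypothetical, purely for illustration:

```python
def pad_observations(user_obs, max_users, pad_value=0.0):
    """Pad (or truncate) a variable-length list of per-user observations
    to a fixed length, so a per-cell agent sees a constant shape."""
    obs = list(user_obs[:max_users])             # truncate extra users
    obs += [pad_value] * (max_users - len(obs))  # pad missing slots
    return obs

print(pad_observations([0.9, 0.4], max_users=4))
# → [0.9, 0.4, 0.0, 0.0]
```

Truncation means a cell serving more than max_users users loses information, which is the trade-off the upper limit introduces.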

More details are in the DeepCoMP paper: https://ris.uni-paderborn.de/download/33854/33855
Specifically, section V.A1 is exactly about your question.

If I want to assign an agent per cell, will that work as well? Is it hard to change? And if I want to change it, I have to change the Handler first and then also change the step in mobile_env.base.step, am I correct?

It should work too but will need some changes to the code of mobile-env. You are right, you would need to add a new, different multi-agent handler.
I would try to keep mobile_env.base.step() unchanged and do everything in the handler.
step() expects a dict of actions, where each user has a selected cell for connecting/disconnecting (or a dummy). In the handler, you could get these actions from the cells instead of the users.

You only have to change step() if you want to completely change the behavior of dis/connecting each user to/from at most one cell per step and allow multiple cells.
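To sketch the translation a per-cell handler would perform, here is a minimal, hypothetical helper that turns per-cell actions into the per-UE action dict that step() expects. The function name and the action encoding (0 = noop, k = select BS k) are assumptions for illustration, not the actual mobile-env Handler API:

```python
def cell_actions_to_ue_actions(cell_actions, num_ues, noop=0):
    """cell_actions: {bs_id: ue_id or None} -- each cell picks one UE to
    toggle this step (None = that cell takes no action).
    Returns: {ue_id: bs_id + 1} for selected UEs, `noop` for idle ones,
    matching a per-UE action encoding where 0 means "do nothing"."""
    ue_actions = {ue: noop for ue in range(num_ues)}
    for bs_id, ue_id in cell_actions.items():
        if ue_id is not None:
            ue_actions[ue_id] = bs_id + 1
    return ue_actions

# BS 0 selects UE 2, BS 1 does nothing; 3 UEs total
print(cell_actions_to_ue_actions({0: 2, 1: None}, num_ues=3))
# → {0: 0, 1: 0, 2: 1}
```

A handler built around such a translation could leave step() untouched, as suggested above.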

If you have more questions about this, better create a new issue, so that it is easier for others to follow and find if they have the same question.

