intelligent-environments-lab / citylearn
Official reinforcement learning environment for demand response and load shaping
License: MIT License
Hi, I noticed that for the competition we are given building_info and observation_space in the online evaluation; you can see this, for example, here.
I added this in the OrderEnforcingWrapper:
class OrderEnforcingAgent:
    """
    Emulates order enforcing wrapper in Pettingzoo for easy integration
    Calls each agent step with agent in a loop and returns the action
    """
    def __init__(self):
        self.num_buildings = None
        self.agent = UserAgent()
        self.action_space = None
    def register_reset(self, observation):
        """Get the first observation after env.reset, return action"""
        action_space = observation["action_space"]
        self.action_space = [dict_to_action_space(asd) for asd in action_space]
        obs = observation["observation"]
        self.num_buildings = len(obs)
        print(f'building_info_in_keys : {"building_info" in observation.keys()}')
If I look at the logs, I see this:
Warning: Gym version v0.24.1 has a number of critical issues with `gym.make` such that environment observation and action spaces are incorrectly evaluated, raising incorrect errors and warning . It is recommend to downgrading to v0.23.1 or upgrading to v0.25.1
/srv/conda/envs/notebook/lib/python3.8/site-packages/sklearn/linear_model/_least_angle.py:34: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
method='lar', copy_X=True, eps=np.finfo(np.float).eps,
/srv/conda/envs/notebook/lib/python3.8/site-packages/sklearn/decomposition/_lda.py:28: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
EPS = np.finfo(np.float).eps
/srv/conda/envs/notebook/lib/python3.8/site-packages/sklearn/ensemble/_gb.py:33: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
from ._gradient_boosting import predict_stages
2022-08-15 21:48:35.016 | INFO | aicrowd_gym.clients.base_oracle_client:register_agent:210 - Registering agent with oracle...
2022-08-15 21:48:35.020 | SUCCESS | aicrowd_gym.clients.base_oracle_client:register_agent:226 - Registered agent with oracle
building_info_in_keys : False
Device:cpu
building_info_in_keys : False
Device:cpu
building_info_in_keys : False
Device:cpu
building_info_in_keys : False
Device:cpu
building_info_in_keys : False
Device:cpu
building_info_in_keys : False
Device:cpu
Will they be added at some point, or will we have to use only what is passed?
Thank you in advance.
On running the line
env = ss.pettingzoo_env_to_vec_env_v1(citylearn_pettingzoo_env)
I get this error:
AssertionError: observation spaces not consistent. Perhaps you should wrap with supersuit.aec_wrappers.pad_observations?
When I change it to
env = ss.pad_observations_v0(citylearn_pettingzoo_env)
creating the env does not throw any error,
but calling the model throws this error:
File "C:\Users\anuj\Anaconda3\envs\city_challenge\lib\site-packages\stable_baselines3\common\vec_env\util.py", line 74, in obs_space_info
shapes[key] = box.shape
AttributeError: 'function' object has no attribute 'shape'
Does anyone know how we can train SB3 models for CityLearn?
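The assertion message above suggests that the agents expose observation vectors of different lengths. A minimal sketch of what padding to a common length means, assuming per-building observations are plain lists of floats; pad_to_common_length is a hypothetical helper for illustration and is not part of CityLearn or SuperSuit:

```python
import numpy as np

def pad_to_common_length(observations, fill_value=0.0):
    """Right-pad each per-building observation so all share one length.

    Hypothetical helper for illustration; not a CityLearn or SuperSuit API.
    """
    target = max(len(o) for o in observations)
    padded = np.full((len(observations), target), fill_value, dtype=np.float32)
    for i, o in enumerate(observations):
        padded[i, :len(o)] = o
    return padded

# Toy observations of unequal length (made-up values).
obs = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
padded = pad_to_common_length(obs)
print(padded.shape)  # (3, 3)
```

Padding only makes the shapes agree; whether the padded slots confuse a shared policy is a separate question that this sketch does not address.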
Hi, I am running main.py. I printed the reward and observed that the agent is getting inconsistent rewards. Is that normal, or am I missing something?
Battery.energy_balance() starts with a call to super().energy_balance().
In both calls, efficiency penalties are applied to the energy balance, so they are applied twice. I think that is wrong, but it should be an easy fix in the Battery module.
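The effect described above can be illustrated with toy classes. This is only a sketch under assumptions: ToyStorage, ToyBatteryDouble and ToyBatteryOnce are hypothetical names, and the efficiency formula is a simplification rather than the actual CityLearn implementation.

```python
class ToyStorage:
    """Hypothetical stand-in for a storage base class; not CityLearn code."""
    def __init__(self, efficiency):
        self.efficiency = efficiency

    def energy_balance(self, energy):
        # Charging (energy >= 0) stores less than requested;
        # discharging (energy < 0) draws more than delivered.
        if energy >= 0:
            return energy * self.efficiency
        return energy / self.efficiency


class ToyBatteryDouble(ToyStorage):
    """Applies the penalty in the parent call and then a second time."""
    def energy_balance(self, energy):
        balance = super().energy_balance(energy)  # efficiency applied once
        if balance >= 0:
            return balance * self.efficiency      # and applied again
        return balance / self.efficiency


class ToyBatteryOnce(ToyStorage):
    """Reuses the parent's result without re-applying the penalty."""
    def energy_balance(self, energy):
        return super().energy_balance(energy)


double = ToyBatteryDouble(0.9).energy_balance(10.0)
once = ToyBatteryOnce(0.9).energy_balance(10.0)
print(double, once)
```

With an efficiency of 0.9, the doubled version effectively charges at 0.81, which is the kind of discrepancy the report points at.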
When the battery is already at 100%, the capacity at the next time step will be less than the capacity at which it reached 100% because of degradation at the previous time step. So, the normalized soc > 1.0. When calculating the max input/output power from the power curve at normalized soc > 1.0, it outputs the value as if soc << 1.0 (maximum output). This will be the case until the soc loss as a result of the loss coefficient brings the soc below the degraded capacity, which will happen in a few time steps.
This is most obvious with a random action agent, but an intelligent agent could learn the behavior over time and just avoid sending large discharge actions when soc == 1.0.
To fix the bug, make sure this line always evaluates 0.0 <= normalized_soc <= 1.0
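A minimal sketch of the suggested bound, assuming the power curve is stored as a 2xN array of (normalized SOC, power fraction) points. The function name, the curve layout and the numbers are assumptions for illustration, not the CityLearn code:

```python
import numpy as np

def max_power_fraction(soc, degraded_capacity, power_curve):
    """Look up the allowed power fraction at the current state of charge.

    power_curve is a hypothetical 2xN array: row 0 holds normalized SOC
    points, row 1 holds the power fraction at each point.
    """
    # Clamp so that stored energy above the degraded capacity reads as 1.0.
    normalized_soc = np.clip(soc / degraded_capacity, 0.0, 1.0)
    return float(np.interp(normalized_soc, power_curve[0], power_curve[1]))

curve = np.array([[0.0, 0.8, 1.0],
                  [1.0, 1.0, 0.2]])
# Stored energy slightly above the degraded capacity.
fraction_full = max_power_fraction(6.41, 6.40, curve)
fraction_half = max_power_fraction(3.20, 6.40, curve)
print(fraction_full, fraction_half)
```

np.interp already holds the end value outside the curve's range, so the explicit clip here mainly makes the bound independent of whichever lookup method is used.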
I am using the CityLearn MARLISA example with my data, and I ran into the error shown in the picture. I checked all my input data and there are no NaN or infinite values.
So, the NaN or infinite value must be produced at marlisa.py line 300, as seen in the error list.
I have worked on solving this error for several days without success. Could you please help me with this? How can I fix this error?
Good morning,
I am working on a fully automated integration of CityLearn with StableBaselines3 and other Gym agents.
Installing citylearn with pip:
pip install citylearn
I have found several problems and I'd like to request some help.
The folders in the "data" folder:
have the wrong name for the weather.csv file; it should be weather_data.csv according to the schema.json of the same folders.
Executing the environment created with
from citylearn.citylearn import CityLearnEnv
env = CityLearnEnv(schema="citylearn_challenge_2020_climate_zone_1")
There are NaN values at the end of every vector of the observation:
print(observation)
>>>
[
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 21.88, 40.35, 70.91, 11.447196, 1.0, 0.0, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 22.94, 33.11, 11.41, 1.0, 0.0, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 21.05, 43.22, 7.61, 0.9813014190740775, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 20.92, 41.61, 1.55, 3.815732, 0.9813886486921994, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 22.57, 41.81, 16.8, 2.3848325, 1.0, 0.0, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 21.95, 43.22, 12.8, 1.907866, 1.0, 0.0, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 23.18, 41.62, 12.3, 0.9266858922799234, 0.0, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 22.94, 41.5, 21.0, 0.9904502885063199, 0.0, 0.0, 0.0, nan],
[1, 4, 9, 11.04, 16.74, 12.48, 7.76, 80.12, 55.33, 67.31, 85.33, 115.19, 164.94, 0.0, 88.31, 16.88, 413.68, 0.0, 450.45, 0.5453005045, 23.1, 41.84, 10.1, 1.0, 0.0, 0.0, 0.0, nan]
]
and that leads to the following error if I use it with stable_baselines3 PPO:
Traceback (most recent call last):
File "\citylearn_playground\citylearn_sb3.py", line 70, in <module>
agent.learn(total_timesteps=100)
File "\citylearn_playground\venv\lib\site-packages\stable_baselines3\ppo\ppo.py", line 317, in learn
return super().learn(
File "\citylearn_playground\venv\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 262, in learn
continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
File "\citylearn_playground\venv\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 172, in collect_rollouts
actions, values, log_probs = self.policy(obs_tensor)
File "\citylearn_playground\venv\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "\citylearn_playground\venv\lib\site-packages\stable_baselines3\common\policies.py", line 590, in forward
distribution = self._get_action_dist_from_latent(latent_pi)
File "\citylearn_playground\venv\lib\site-packages\stable_baselines3\common\policies.py", line 606, in _get_action_dist_from_latent
return self.action_dist.proba_distribution(mean_actions, self.log_std)
File "\citylearn_playground\venv\lib\site-packages\stable_baselines3\common\distributions.py", line 153, in proba_distribution
self.distribution = Normal(mean_actions, action_std)
File "\citylearn_playground\venv\lib\site-packages\torch\distributions\normal.py", line 56, in __init__
super(Normal, self).__init__(batch_shape, validate_args=validate_args)
File "\citylearn_playground\venv\lib\site-packages\torch\distributions\distribution.py", line 56, in __init__
raise ValueError(
ValueError: Expected parameter loc (Tensor of shape (1, 9)) of distribution Normal(loc: torch.Size([1, 9]), scale: torch.Size([1, 9])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan]], device='cuda:0')
Process finished with exit code 1
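Before handing observations to a policy, it can help to locate the non-finite entries. A small sketch, assuming observations arrive as one list of floats per building; find_nan_positions is a hypothetical helper and the sample values are made up to mimic the printout above:

```python
import numpy as np

def find_nan_positions(observations):
    """Return (building_index, feature_index) pairs where an observation is NaN."""
    positions = []
    for b, obs in enumerate(observations):
        arr = np.asarray(obs, dtype=float)
        positions.extend((b, int(i)) for i in np.flatnonzero(np.isnan(arr)))
    return positions

# Toy data shaped like the printout above: NaN in the last slot of each vector.
sample = [[1.0, 0.0, float("nan")], [0.98, 0.0, 0.0, float("nan")]]
print(find_nan_positions(sample))  # [(0, 2), (1, 3)]
```

Knowing the feature index makes it possible to check which observation name in the schema corresponds to the NaN slot.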
The package installed with pip is considerably different from the one found in this repository. Is the pip package not "official"?
What is the recommended way to install CityLearn for usage?
This is the class I am using to transform the action and observation space for gym. If there is an official or better way, I would like to ask for a bit of help.
from citylearn.citylearn import CityLearnEnv
from stable_baselines3 import PPO
from stable_baselines3.ppo import MlpPolicy
import gym
import numpy as np
class EnvCityGym(gym.Env):
    def __init__(self, env):
        self.env = env
        self.num_envs = 1
        # get the number of buildings
        self.num_buildings = len(env.action_spaces)
        self.act_lows = np.array([])
        self.act_highs = np.array([])
        for uid in env.buildings_states_actions:
            #print(env.buildings_states_actions[uid]["actions"])
            #print(sum(env.buildings_states_actions[uid]["actions"].values()))
            self.act_lows = np.concatenate((self.act_lows, np.array([-1] * sum(env.buildings_states_actions[uid]["actions"].values())),))
            self.act_highs = np.concatenate((self.act_highs, np.array([1] * sum(env.buildings_states_actions[uid]["actions"].values())),))
        # define action and observation space
        #log.debug(self.act_lows)
        #log.debug(self.act_highs)
        self.action_space = gym.spaces.Box(low=self.act_lows,
                                           high=self.act_highs, dtype=np.float32)
        self.obs_lows = np.array([])
        self.obs_highs = np.array([])
        for obs_box in env.observation_spaces:
            self.obs_lows = np.concatenate((self.obs_lows, obs_box.low))
            self.obs_highs = np.concatenate((self.obs_highs, obs_box.high))
        self.observation_space = gym.spaces.Box(low=self.obs_lows, high=self.obs_highs,
                                                dtype=np.float32)
    def reset(self):
        obs = self.env.reset()
        observation = self.get_observation(obs)
        return observation
    def get_observation(self, obs):
        obs_list = np.array([])
        for obs_box in obs:
            obs_list = np.concatenate((obs_list, obs_box))
        print(obs)
        #obs_list = np.nan_to_num(obs_list) #This removes the nan from the observation but does not solve the issue
        print(obs_list)
        return obs_list
    def step(self, action):
        action = [[act] for act in action]
        obs, reward, done, info = self.env.step(action)
        observation = self.get_observation(obs)
        return observation, sum(reward), done, info
    def render(self, mode='human'):
        return self.env.render(mode)
if __name__ == "__main__":
    import torch as th
    th.autograd.set_detect_anomaly(True)
    city_env = CityLearnEnv(schema="citylearn_challenge_2020_climate_zone_1")
    env = EnvCityGym(city_env)
    agent = PPO(policy=MlpPolicy, env=env)
    agent.learn(total_timesteps=100)
state = env.reset()
done = False
action, coordination_vars = agent.select_action(state)
while not done:
    next_state, reward, done, _ = env.step(action)
    action_next, coordination_vars_next = agent.select_action(next_state)
    coordination_vars = coordination_vars_next
    state = next_state
    action = action_next
env.cost()
I get an error when I run the example modules for RBC and SAC: AttributeError: type object 'CostFunction' has no attribute 'net_electricity_consumption'
When I run the quickstart example, it runs fine without any error.
Thank you
Is your feature request related to a problem? Please describe.
The method of internally estimating action and observation space limits, though generalized enough, does not always provide the best limits.
Describe the solution you'd like
Describe alternatives you've considered
NIL
Additional context
NIL
ValueError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_23588\1249709747.py in
8 env = StableBaselines3Wrapper(env)
9 model = SAC('MlpPolicy', env)
---> 10 model.learn(total_timesteps=env.time_steps*2)
11
12 # evaluate
e:\Anaconda\envs\pc\lib\site-packages\stable_baselines3\sac\sac.py in learn(self, total_timesteps, callback, log_interval, tb_log_name, reset_num_timesteps, progress_bar)
311 tb_log_name=tb_log_name,
312 reset_num_timesteps=reset_num_timesteps,
--> 313 progress_bar=progress_bar,
314 )
315
e:\Anaconda\envs\pc\lib\site-packages\stable_baselines3\common\off_policy_algorithm.py in learn(self, total_timesteps, callback, log_interval, tb_log_name, reset_num_timesteps, progress_bar)
304 reset_num_timesteps,
305 tb_log_name,
--> 306 progress_bar,
307 )
308
e:\Anaconda\envs\pc\lib\site-packages\stable_baselines3\common\off_policy_algorithm.py in _setup_learn(self, total_timesteps, callback, reset_num_timesteps, tb_log_name, progress_bar)
287 reset_num_timesteps,
288 tb_log_name,
--> 289 progress_bar,
290 )
291
e:\Anaconda\envs\pc\lib\site-packages\stable_baselines3\common\base_class.py in _setup_learn(self, total_timesteps, callback, reset_num_timesteps, tb_log_name, progress_bar)
422 assert self.env is not None
423 # pytype: disable=annotation-type-mismatch
--> 424 self._last_obs = self.env.reset() # type: ignore[assignment]
425 # pytype: enable=annotation-type-mismatch
426 self._last_episode_starts = np.ones((self.env.num_envs,), dtype=bool)
e:\Anaconda\envs\pc\lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py in reset(self)
74 def reset(self) -> VecEnvObs:
75 for env_idx in range(self.num_envs):
---> 76 obs, self.reset_infos[env_idx] = self.envs[env_idx].reset(seed=self._seeds[env_idx])
77 self._save_obs(env_idx, obs)
78 # Seeds are only used once
e:\Anaconda\envs\pc\lib\site-packages\stable_baselines3\common\monitor.py in reset(self, **kwargs)
81 raise ValueError(f"Expected you to pass keyword argument {key} into reset")
82 self.current_reset_info[key] = value
---> 83 return self.env.reset(**kwargs)
84
85 def step(self, action: ActType) -> Tuple[ObsType, SupportsFloat, bool, bool, Dict[str, Any]]:
e:\Anaconda\envs\pc\lib\site-packages\shimmy\openai_gym_compatibility.py in reset(self, seed, options)
239 )
240
--> 241 obs = self.gym_env.reset()
242
243 if self.render_mode == "human":
e:\Anaconda\envs\pc\lib\site-packages\gym\core.py in reset(self, **kwargs)
321 def reset(self, **kwargs) -> Tuple[ObsType, dict]:
322 """Resets the environment with kwargs."""
--> 323 return self.env.reset(**kwargs)
324
325 def render(
e:\Anaconda\envs\pc\lib\site-packages\gym\core.py in reset(self, **kwargs)
377 def reset(self, **kwargs):
378 """Resets the environment, returning a modified observation using :meth:self.observation."""
--> 379 obs, info = self.env.reset(**kwargs)
380 return self.observation(obs), info
381
e:\Anaconda\envs\pc\lib\site-packages\gym\core.py in reset(self, **kwargs)
321 def reset(self, **kwargs) -> Tuple[ObsType, dict]:
322 """Resets the environment with kwargs."""
--> 323 return self.env.reset(**kwargs)
324
325 def render(
e:\Anaconda\envs\pc\lib\site-packages\gym\core.py in reset(self, **kwargs)
321 def reset(self, **kwargs) -> Tuple[ObsType, dict]:
322 """Resets the environment with kwargs."""
--> 323 return self.env.reset(**kwargs)
324
325 def render(
e:\Anaconda\envs\pc\lib\site-packages\gym\core.py in reset(self, **kwargs)
377 def reset(self, **kwargs):
378 """Resets the environment, returning a modified observation using :meth:self.observation."""
--> 379 obs, info = self.env.reset(**kwargs)
380 return self.observation(obs), info
381
ValueError: not enough values to unpack (expected 2, got 1)
I ran quickstart.ipynb without making any changes and it throws this error. May I ask why?
The quickstart.ipynb notebook is at https://github.com/intelligent-environments-lab/CityLearn/blob/master/examples/quickstart.ipynb
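The final ValueError is consistent with a wrapper that expects reset() to return an (observation, info) pair while the wrapped environment returns only the observation. The trace does not confirm the root cause, but a mismatch between installed package versions is one possible explanation. A minimal sketch of an adapter that tolerates both conventions; reset_with_info and OldStyleEnv are hypothetical names used only for illustration:

```python
def reset_with_info(env, **kwargs):
    """Call env.reset() and always return an (observation, info) pair.

    Hypothetical adapter: treats a 2-tuple whose second item is a dict as
    the newer (obs, info) convention and anything else as a bare observation.
    """
    result = env.reset(**kwargs)
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result
    return result, {}


class OldStyleEnv:
    """Toy environment whose reset() returns only the observation."""
    def reset(self):
        return [[0.1, 0.2, 0.3]]


obs, info = reset_with_info(OldStyleEnv())
print(obs, info)  # [[0.1, 0.2, 0.3]] {}
```

In practice, installing the package versions that the notebook was written against is usually a cleaner route than adapting return values by hand.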
@kingsleynweye Kingsley Nweye
When I run:
"
from citylearn.citylearn import CityLearnEnv
from citylearn.agents.sac import SAC as RLAgent
dataset_name = 'baeda_3dem'
env = CityLearnEnv(dataset_name, central_agent=False, simulation_end_time_step=10)
model = RLAgent(env)
model.learn(episodes=2, deterministic_finish=True)
"
I get this:
"
obs: [-2.44929360e-16 1.00000000e+00 1.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
0.00000000e+00 0.00000000e+00 5.40640817e-01 8.41253533e-01
2.39775427e-01 -1.93590194e-08 6.18397155e-01 3.73208446e-01
0.00000000e+00 5.10551796e-01 0.00000000e+00 0.00000000e+00
4.57856874e-09 4.57856874e-09 4.57856874e-09 4.57856874e-09
4.95913909e-01 0.00000000e+00 0.00000000e+00 3.75607514e-01]
mean: None
std: None
"
with the traceback:
"---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/sac.py in get_normalized_observations(self, index, observations)
230 try:
--> 231 return (np.array(observations, dtype = float) - self.norm_mean[index])/self.norm_std[index]
232 except:
TypeError: unsupported operand type(s) for -: 'float' and 'NoneType'
During handling of the above exception, another exception occurred:
AssertionError Traceback (most recent call last)
in
5 env = CityLearnEnv(dataset_name, central_agent=False, simulation_end_time_step=10)
6 model = RLAgent(env)
----> 7 model.learn(episodes=2, deterministic_finish=True)
8
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/base.py in learn(self, episodes, keep_env_history, env_history_directory, deterministic, deterministic_finish, logging_level)
139
140 while not self.env.done:
--> 141 actions = self.predict(observations, deterministic=deterministic)
142
143 # apply actions to citylearn_env
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/sac.py in predict(self, observations, deterministic)
188
189 if self.time_step > self.end_exploration_time_step or deterministic:
--> 190 actions = self.get_post_exploration_prediction(observations, deterministic)
191
192 else:
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/sac.py in get_post_exploration_prediction(self, observations, deterministic)
204 for i, o in enumerate(observations):
205 o = self.get_encoded_observations(i, o)
--> 206 o = self.get_normalized_observations(i, o)
207 o = torch.FloatTensor(o).unsqueeze(0).to(self.device)
208 result = self.policy_net[i].sample(o)
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/sac.py in get_normalized_observations(self, index, observations)
236 print('std:',self.norm_std[index])
237 print(self.time_step, self.standardize_start_time_step, self.batch_size, len(self.replay_buffer[0]))
--> 238 assert False
239
240 def get_encoded_observations(self, index: int, observations: List[float]) -> npt.NDArray[np.float64]:
AssertionError:
"
Please describe what you expected to happen.
Please describe what actually happened.
Please provide detailed steps to reproduce the issue.
If you have any ideas for how to fix the issue, please describe them here.
Please provide any additional information that may be helpful in resolving this issue.
As a CityLearn user, I want to be able to take advantage of Stable Baselines3's reliable implementations of RL algorithms so that I can easily evaluate my environment on a diverse set of algorithms and benchmark the performance of the algorithms.
Changes can be made to the environment as long as the evaluation criteria below are met.
Hi,
I tried to run the "citylearn_sb3.py" file, but received the error below:
File "anaconda3/envs/citylearn/lib/python3.8/site-packages/supersuit/vector/markov_vector_wrapper.py", line 22, in __init__
assert all(
AssertionError: observation spaces not consistent. Perhaps you should wrap with `supersuit.aec_wrappers.pad_observations`?
There might be a problem with the supersuit version. Which version do you use?
Thank you.
Line 373 in 8245a54
max_cooling is defined as min(max_electric_power, self.nominal_power)*self.cop_cooling[self.time_step]. But max_heating is defined the same way. Is it supposed to use cop_heating?
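If the suspicion above is right, the two limits would differ only in which coefficient of performance they use. A sketch with made-up numbers; max_thermal_output is a hypothetical helper, not the method in the repository:

```python
def max_thermal_output(max_electric_power, nominal_power, cop):
    """Thermal output limit: usable electric power times the mode's COP."""
    return min(max_electric_power, nominal_power) * cop

# Made-up values for a single time step.
cop_cooling, cop_heating = 3.0, 2.5
max_cooling = max_thermal_output(4.0, 5.0, cop_cooling)
max_heating = max_thermal_output(4.0, 5.0, cop_heating)
print(max_cooling, max_heating)  # 12.0 10.0
```

Using cop_cooling for both would report 12.0 for heating as well, overstating the heating limit in this toy case.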
The datasets for Climate_zone 1-4 are missing the carbon_intensity.csv file; thus, the env can't be initialized with Climate_zone 1-4. Also, the paper "MARLISA: Multi-Agent Reinforcement Learning with Iterative Sequential Action Selection for Load Shaping of Grid-Interactive Connected Buildings" says that the datasets for climate zone 2A contain five years of data. But in fact, there are no climate zone datasets containing more than five years in this project. I am confused about the datasets used in the papers. I would appreciate it if you could upload a dataset explanation document and the missing carbon_intensity.csv files.
Is your feature request related to a problem? Please describe.
Need to be able to consider EV loads for V2G and G2V applications.
Describe the solution you'd like
Integrate the work by @calofonseca in a future CityLearn release:
Describe alternatives you've considered
NIL
Additional context
NIL
I have a question about how to integrate prediction and control models together.
I think using a prediction model, we can measure the sensor input and control input, and then we can predict future energy consumption. For the control problem, we generate an optimal control decision based on the state observation. As we can see, there are two tracks, prediction and control:
https://www.aicrowd.com/challenges/neurips-2023-citylearn-challenge.
Based on my knowledge, I think the prediction task is to learn a simulator that could be used for RL training.
When I studied this demo:
from citylearn.agents.rbc import BasicRBC as RBCAgent
from citylearn.citylearn import CityLearnEnv, EvaluationCondition
import citylearn
dataset_name = 'citylearn_challenge_2022_phase_1'
env = CityLearnEnv(dataset_name, central_agent=True, simulation_end_time_step=1000)
model = RBCAgent(env)
model.learn(episodes=4)
## print cost functions at the end of episode
kpis = model.env.evaluate(baseline_condition=EvaluationCondition.WITHOUT_STORAGE_BUT_WITH_PARTIAL_LOAD_AND_PV)
kpis = kpis.pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')
print(kpis)
print(citylearn.data.DataSet.get_names())
I think it directly trains the model on the dataset. I am a bit confused: should it build a prediction model first, then train the RL agent? I think it is directly trained on the dataset here: https://github.com/intelligent-environments-lab/CityLearn/tree/master/citylearn/data. So, do we really need a prediction model?
Dear all,
First of all, thank you for creating and maintaining such an interesting open-source OpenAI Gym environment for MARL as a way to standardize development in the area. I'm currently working on a V2G optimization multi-agent architecture, and while doing the state-of-the-art research I came across CityLearn. As far as I understand, a significant set of assets is already implemented for the OpenAI Gym, including stationary batteries. But I think it would be interesting to add vehicle batteries and their specific modeling.
For example, adding specificities such as State of Charge (SOC) on arrival, requested SOC of EV at departure, requested departure hour, typical arrival and departure date time, maximum EV charger efficiency, among others. I think the citylearn.energy_model.Battery already models a big part of the batteries and so I think adding V2G to the environment would be a very interesting step forward.
Are there any plans to implement such elements?
Best Regards,
Tiago Fonseca
Line 592 in b451f05
When using a central agent, the line referenced above breaks the code because it can't recognize the electrical_storage_soc state. When modifying the line above such that it takes that state name, we have a state size of 102. But when doing env.reset(), we have a state size of 93. So, the electrical_storage_soc is missing from the state, which is expected.
However, in building_loader, you have excluded electrical_storage_soc, which explains why adding the same conditional fixes the bug when creating the agent (the bug comes from reset), but the state space is still 102 while env.reset still gives 93.
Since you understand the environment better, I must be missing something silly.
Please let me know if you find the issue.
I've trained a PPO using 20 discrete actions (controlling electrical SOC) and I'm trying to explain what each action does, is there a way to map them back to the continuous space from [-1.0, 1.0]? I'm assuming 0 is a full charge and equivalent to 1.0, and 19 to -1.0, but how could I map out the intermediate values?
Thanks!
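One way to reason about the intermediate values, assuming the bins are evenly spaced over [-1.0, 1.0]. Whether index 0 corresponds to -1.0 or to 1.0 depends on how the discretization wrapper was written, so the direction is exposed as a flag here. discrete_to_continuous is a hypothetical helper, not a CityLearn function:

```python
import numpy as np

def discrete_to_continuous(index, n_actions=20, low=-1.0, high=1.0, descending=False):
    """Map a discrete action index to an evenly spaced value in [low, high]."""
    grid = np.linspace(low, high, n_actions)
    if descending:
        # Use this when index 0 is meant to be the highest value.
        grid = grid[::-1]
    return float(grid[index])

for i in (0, 9, 10, 19):
    print(i, round(discrete_to_continuous(i), 4))
```

With an even number of bins such as 20, no index lands exactly on 0.0; the two middle indices sit at -1/19 and +1/19.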
Is your feature request related to a problem? Please describe.
I am trying to test a pricing algorithm using CityLearn platform. I am wondering if there are ways to access the planned states and actions of each building, or whether the environment has planned states and actions?
Describe the solution you'd like
Describe alternatives you've considered
I have considered randomly initializing planned states for each building if the environment doesn't have them already.
Additional context
Thank you!
In energy_model.py line 554, the assertion should be "assert 0 <= loss_coefficient <= 1, 'loss_coefficient must be >= 0 and <= 1.'"
Hello, thanks for providing this environment.
I took part in the 2022 challenge and I am looking at the 2021 environment. I'm confused about the right way to evaluate the agent. In the 2022 challenge we have the env.evaluate() function; in the 2021 environment it seems that the env.cost() method is used to evaluate the agent (from the challenge page), but it doesn't seem to exist anymore.
Do we have to use the cost functions in the citylearn.cost_function file and implement our own cost function?
Thank you!
citylearn/agents/base.py has this call to update:
self.update(observations, actions, rewards, next_observations, done=done)
but the update in citylearn/agents/q_learning.py has this definition:
def update(self, observations: List[List[float]], actions: List[List[float]], reward: List[float], next_observations: List[List[float]])
which does not have a done parameter, and this results in the following error:
File ~/.conda/envs/citytest310c/lib/python3.10/site-packages/citylearn/agents/base.py:155, in Agent.learn(self, episodes, deterministic, deterministic_finish, logging_level)
153 # update
154 if not deterministic:
--> 155 self.update(observations, actions, rewards, next_observations, done=done)
156 else:
157 pass
TypeError: TabularQLearning.update() got an unexpected keyword argument 'done'
if you run examples/citylearn_rlem23_tutorial.ipynb with a current version of citylearn.
The example to work with a current version of citylearn.
The above error.
Use the code in examples/citylearn_rlem23_tutorial.ipynb with citylearn 2.1.0
Either remove the done=done from base.py, or add back the done parameter removed in 3b562b9
I realize the example is targeted to version 1.8.0, but I am not sure how you would use TabularQLearning in version 2.1.0 without triggering this issue.
Thanks.
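The second option mentioned above, accepting done in the subclass, can be sketched with toy classes. ToyBaseAgent and ToyQLearning are hypothetical stand-ins, not the CityLearn agents, and the update body is deliberately trivial:

```python
class ToyBaseAgent:
    """Toy stand-in for a base agent whose learn loop passes done=done."""
    def learn_step(self, observations, actions, rewards, next_observations, done):
        self.update(observations, actions, rewards, next_observations, done=done)


class ToyQLearning(ToyBaseAgent):
    def __init__(self):
        self.updates = 0

    def update(self, observations, actions, reward, next_observations, done=None):
        # done is accepted for signature compatibility; this toy ignores it.
        self.updates += 1


agent = ToyQLearning()
agent.learn_step([[0.0]], [[0.0]], [0.0], [[1.0]], done=False)
print(agent.updates)  # 1
```

Giving done a default value keeps existing callers that omit it working while allowing the base class to pass it by keyword.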
After defining my own reward function and updating the schema in the source code following this link: https://www.citylearn.net/overview/reward_function.html?highlight=custom_module
I keep getting the "ModuleNotFoundError: No module named 'custom_module'" error when defining env = CityLearnEnv(dataset_name, central_agent=True, simulation_end_time_step=WINDOW*14)
Please describe what you expected to happen.
Please describe what actually happened.
After following the above link, I run:
dataset_name = 'citylearn_challenge_2022_phase_1'
WINDOW = 24
env = CityLearnEnv(dataset_name, central_agent=True, simulation_end_time_step=WINDOW*14)
This gives the error.
If you have any ideas for how to fix the issue, please describe them here.
Please provide any additional information that may be helpful in resolving this issue.
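In Python, a ModuleNotFoundError generally means the named module is not on the import path of the running interpreter. The sketch below shows the mechanism in isolation, assuming (this is not confirmed by the source) that the environment imports the reward class by the module name given in the schema. The temporary directory and the CustomReward stub are made up for the demonstration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory holding a stand-in custom_module.py.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "custom_module.py").write_text(
    "class CustomReward:\n"
    "    name = 'custom'\n"
)

# Without this line, importing custom_module raises ModuleNotFoundError
# unless the file already sits in the working directory or on PYTHONPATH.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

module = importlib.import_module("custom_module")
print(module.CustomReward.name)  # custom
```

In a notebook, the equivalent step would be to add the directory that contains custom_module.py to sys.path before the environment is created.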
Hi, I am interested in this project and I have tried running main.py. I experience high RAM usage that causes my whole computer to freeze. Is that normal? How do I make use of my CUDA-enabled GPU for computation? Is there a memory leak? Please advise, thank you. I am using Ubuntu 18.04.
When I try to reproduce the results from the Quickstart section (Decentralized-Independent SAC), I get a TypeError with this detailed message:
"---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in
8 env = CityLearnEnv(dataset_name, central_agent=False, simulation_end_time_step=1000)
9 model = RLAgent(env)
---> 10 model.learn(episodes=1, deterministic_finish=True)
11
12 # print cost functions at the end of episode
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/base.py in learn(self, episodes, keep_env_history, env_history_directory, deterministic, deterministic_finish, logging_level)
139
140 while not self.env.done:
--> 141 actions = self.predict(observations, deterministic=deterministic)
142
143 # apply actions to citylearn_env
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/sac.py in predict(self, observations, deterministic)
183
184 if self.time_step > self.end_exploration_time_step or deterministic:
--> 185 actions = self.get_post_exploration_prediction(observations, deterministic)
186
187 else:
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/sac.py in get_post_exploration_prediction(self, observations, deterministic)
199 for i, o in enumerate(observations):
200 o = self.get_encoded_observations(i, o)
--> 201 o = self.get_normalized_observations(i, o)
202 o = torch.FloatTensor(o).unsqueeze(0).to(self.device)
203 result = self.policy_net[i].sample(o)
~/opt/anaconda3/lib/python3.8/site-packages/citylearn/agents/sac.py in get_normalized_observations(self, index, observations)
224 def get_normalized_observations(self, index: int, observations: List[float]) -> npt.NDArray[np.float64]:
225 # try:
--> 226 return (np.array(observations, dtype = float) - self.norm_mean[index])/self.norm_std[index]
227 # except:
228 # # print("unable to get normalized observations")
TypeError: unsupported operand type(s) for -: 'float' and 'NoneType'"
Function get_normalized_observations is supposed to normalize the observations.
I printed norm_mean and norm_std within get_normalized_observations from sac.py and found that they are all None, from the initialization.
I just copied the code from Quickstart section:
from citylearn.citylearn import CityLearnEnv
from citylearn.agents.sac import SAC as RLAgent
dataset_name = 'citylearn_challenge_2022_phase_1'
env = CityLearnEnv(dataset_name, central_agent=False, simulation_end_time_step=1000)
model = RLAgent(env)
model.learn(episodes=2, deterministic_finish=True)
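The traceback above suggests that the normalization statistics were still None when the deterministic prediction branch ran. A minimal sketch of a guard that fails with a clearer message is shown below. It is a standalone function, not CityLearn's own method; the function name and error text are hypothetical.

```python
from typing import List, Optional, Sequence


def normalize_observations(
    observations: Sequence[float],
    norm_mean: Optional[Sequence[float]],
    norm_std: Optional[Sequence[float]],
) -> List[float]:
    """Standardize observations, failing loudly if the statistics are missing."""
    if norm_mean is None or norm_std is None:
        # Hypothetical message; the agent in the traceback raises a TypeError instead.
        raise RuntimeError(
            "normalization statistics are not set; "
            "they must be computed before observations can be normalized"
        )
    return [(o - m) / s for o, m, s in zip(observations, norm_mean, norm_std)]
```

With such a guard, the failure would point at the missing statistics rather than at the subtraction.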
Hi, when I create a CityLearn environment with env = CityLearnEnv(schema='.data/citylearn_challenge_2022_phase_1/schema.json'), it raises this error. I installed CityLearn with: pip install git+https://github.com/intelligent-environments-lab/CityLearn.git@citylearn_2022
Downloading the quickstart jupyter notebook resulted in an attribute error after running the second code cell. The error originates from the evaluate function, where, apparently, a building in the schema doesn't have the 'net_electricity_consumption_without_storage_and_partial_load' attribute.
Expected an output of the cost functions.
Got the following error:
CityLearnEnv.evaluate() raises AttributeError: 'Building' object has no attribute 'net_electricity_consumption_without_storage_and_partial_load'.
pip install CityLearn==2.0.0
from citylearn.citylearn import CityLearnEnv
from citylearn.agents.rbc import BasicRBC as RBCAgent
dataset_name = 'citylearn_challenge_2022_phase_1'
env = CityLearnEnv(dataset_name, central_agent=True, simulation_end_time_step=1000)
model = RBCAgent(env)
model.learn(episodes=1)
# print cost functions at the end of episode
kpis = model.env.evaluate().pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')
display(kpis)
I tested it outside of the Jupyter notebook as well and got the same error. I also tried some different agents, but it didn't make a difference.
Changing the dataset to 'baeda_3dem' got rid of the error. This gave me the impression the error likely has to do with the schema.json.
However, given that the error happened through basic use of the citylearn package I wouldn't be surprised if the mistake was on my part.
I'm getting an error when I try to reproduce the examples/tutorial.ipynb operation.
Traceback (most recent call last):
File "E:\PycharmProjects\citylearn\tutorail.py", line 709, in
_ = tql_model.learn(episodes=tql_episodes)
File "E:\PycharmProjects\citylearn\venv\lib\site-packages\citylearn\agents\base.py", line 150, in learn
next_observations, rewards, done, _ = self.env.step(actions)
File "E:\PycharmProjects\citylearn\venv\lib\site-packages\gym\core.py", line 319, in step
return self.env.step(action)
File "E:\PycharmProjects\citylearn\venv\lib\site-packages\gym\core.py", line 456, in step
return self.env.step(self.action(action))
File "E:\PycharmProjects\citylearn\venv\lib\site-packages\gym\core.py", line 456, in step
return self.env.step(self.action(action))
File "E:\PycharmProjects\citylearn\venv\lib\site-packages\gym\core.py", line 380, in step
observation, reward, terminated, truncated, info = self.env.step(action)
File "E:\PycharmProjects\citylearn\venv\lib\site-packages\gym\core.py", line 380, in step
observation, reward, terminated, truncated, info = self.env.step(action)
ValueError: not enough values to unpack (expected 5, got 4)
The environment is the same as the tutorial configuration
python:3.9
I tried changing the number of unpacked variables in base.py to match the source code, but that produces another error.
What should I do?
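The traceback shows a wrapper expecting five values from step while the inner environment returns four. One possible workaround, sketched here as an assumption and not as the project's fix, is a small helper that accepts either shape. The name unpack_step is hypothetical.

```python
from typing import Any, Dict, Tuple


def unpack_step(result: Tuple[Any, ...]) -> Tuple[Any, Any, bool, Dict[str, Any]]:
    """Return (observations, rewards, done, info) from a 4- or 5-value step result."""
    if len(result) == 5:
        # Five-value form: (obs, reward, terminated, truncated, info).
        obs, reward, terminated, truncated, info = result
        return obs, reward, bool(terminated or truncated), info
    if len(result) == 4:
        # Four-value form: (obs, reward, done, info).
        obs, reward, done, info = result
        return obs, reward, bool(done), info
    raise ValueError(f"unexpected step result of length {len(result)}")
```

Pinning the gym version that the tutorial was written for is the other obvious option; the helper only papers over the mismatch at the call site.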
Hi, I think the day_type returned by get_periodic_observation_metadata function in building.py might be wrong
def get_periodic_observation_metadata(self) -> Mapping[str, int]:
r"""Get periodic observation names and their minimum and maximum values for periodic/cyclic normalization.
Returns
-------
periodic_observation_metadata : Mapping[str, int]
Observation low and high limits.
"""
return {
'hour': range(1, 25),
'day_type': range(1, 9),
'month': range(1, 13)
}
According to the documentation, day_type is the day of the week, ranging from 1 (Monday) through 7 (Sunday), so I think the correct range for day_type is 1 to 7. This can be verified: no value of 8 appears in day_type in the building CSV files.
def get_periodic_observation_metadata(self) -> Mapping[str, int]:
r"""Get periodic observation names and their minimum and maximum values for periodic/cyclic normalization.
Returns
-------
periodic_observation_metadata : Mapping[str, int]
Observation low and high limits.
"""
return {
'hour': range(1, 25),
'day_type': range(1, 8),
'month': range(1, 13)
}
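To illustrate why the length of the range matters, here is a minimal sketch of one common cyclic encoding (sine and cosine on the unit circle). It is an assumption for illustration and not necessarily the formula CityLearn applies; cyclic_encode is a hypothetical name.

```python
import math
from typing import Tuple


def cyclic_encode(value: int, values: range) -> Tuple[float, float]:
    """Map a periodic value onto the unit circle as (sin, cos)."""
    period = len(values)  # 7 for range(1, 8), 8 for range(1, 9)
    angle = 2 * math.pi * (value - values.start) / period
    return math.sin(angle), math.cos(angle)


day_type = range(1, 8)  # 1 (Monday) through 7 (Sunday)
monday = cyclic_encode(1, day_type)
sunday = cyclic_encode(7, day_type)
```

With a period of 7, Sunday sits one step before Monday on the circle. With range(1, 9) the period becomes 8, which leaves an unused slot between Sunday and Monday.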
Thank you for sharing the repo. I have a question about reproducing the MARLISA results.
In the paper, it says:
MARLISA performed constrained random action exploration for the first 250 days of the simulation. Then, it performed an exploration-exploitation process using SAC to maximize the expected rewards and the entropy of the policy (for 300 more days). Finally, after 550 days into the simulation, MARLISA started to evaluate the stochastic policy deterministically (by choosing the mean value of the policy rather than sampling from it).
So, in order to reproduce Marlisa, I have to set:
'start_training': 6000 and 'exploration_period': 6000 (250 days * 24 hours),
'safe_exploration': False, since with True it is controlled by RBC,
'is_evaluate' = True, so that Marlisa gives a deterministic policy.
Am I right?
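The day counts quoted from the paper can be converted to time steps as follows. This is only the arithmetic, assuming hourly time steps; the variable names are illustrative and are not agent parameters.

```python
HOURS_PER_DAY = 24  # assumes one simulation time step per hour

random_exploration_days = 250  # constrained random exploration
sac_exploration_days = 300     # exploration-exploitation with SAC

random_exploration_steps = random_exploration_days * HOURS_PER_DAY
deterministic_start_step = (random_exploration_days + sac_exploration_days) * HOURS_PER_DAY

print(random_exploration_steps)  # 6000
print(deterministic_start_step)  # 13200
```

So under that assumption, deterministic evaluation in the paper begins after 13200 steps, not after 6000.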
When I run
python main.py
I get this error, seems the interface is broken, please fix it
Traceback (most recent call last):
File "/Users/matthewd/PycharmProjects/CityLearn/main.py", line 37, in
agents = Agent(**params_agent)
TypeError: __init__() got an unexpected keyword argument 'observation_spaces'
Hi,
I tried to find the solutions on this website. However, when I clicked to view the leaderboard winner and then clicked the REPO_URL, the link had expired, so I couldn't find any solution.
Where can we find the winning solution for the CityLearn Challenge 2022, or solutions for previous challenges? Or is there a competition solution paper available?
Thank you!!
Is your feature request related to a problem? Please describe.
Describe the solution you'd like
gymnasium environment
Describe alternatives you've considered
NIL
Additional context
NIL
Hi everyone,
I was doing some experimentation and I think I ran into an issue that is outside of my control. When getting the reward from the env.step method, I get this error:
ValueError: operands could not be broadcast together with shapes (8760,) (8761,)
It was raised when computing net_electricity_consumption (the original function) at line 344 of building.py.
Could anyone give me a hint about how to solve this issue?
Thanks in advance!
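A shape mismatch like (8760,) versus (8761,) means one of the summed time series has one extra entry. A minimal debugging sketch for locating it is shown below; the helper and the series names in the usage example are hypothetical.

```python
from typing import Dict, Sequence


def find_length_mismatches(
    series: Dict[str, Sequence[float]], expected: int
) -> Dict[str, int]:
    """Return the name and length of every series whose length differs from expected."""
    return {
        name: len(values)
        for name, values in series.items()
        if len(values) != expected
    }


# Hypothetical series names, for illustration only.
example = {
    "cooling": [0.0] * 8760,
    "storage": [0.0] * 8761,
}
mismatches = find_length_mismatches(example, expected=8760)
```

Here mismatches would contain only the entry for the series with 8761 values.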
The list of observations names is smaller than the observation space returned by the environment. My observation space has 31 elements, but I only have the names for 28, and don't know which are unnamed.
I expect the environment would provide the name of each observation/feature in the observation space, so there'd be a name for every feature in the observation space.
The environment does not provide the name of each observation/feature in the observation space, as there are fewer names than features
from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import NormalizedObservationWrapper, StableBaselines3Wrapper, DiscreteActionWrapper
from citylearn.data import DataSet
dataset_name = 'citylearn_challenge_2022_phase_1'
schema = DataSet.get_schema(dataset_name)
env = CityLearnEnv(schema,
                   central_agent=True,
                   buildings='building_1')
env = DiscreteActionWrapper(env)
env = NormalizedObservationWrapper(env)
env = StableBaselines3Wrapper(env)
print(len(env.observation_names)) #28
print(env.observation_space.shape[0]) #31
Sorry, no idea
env.observation_names lists the observations which are active in the schema, but the env lists a larger observation space.
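One possible explanation, stated here as an assumption that the thread does not confirm, is that the normalization wrapper expands each periodic observation (hour, day_type, month) into two columns, which would turn 28 names into 31 features. A sketch of how the names could be expanded to match is shown below; the _sin and _cos suffixes and the placeholder names are hypothetical.

```python
from typing import List, Sequence


def expanded_observation_names(
    names: Sequence[str], periodic: Sequence[str]
) -> List[str]:
    """Give each periodic observation two names and keep the others unchanged."""
    expanded: List[str] = []
    for name in names:
        if name in periodic:
            # Assumed layout: a sine column followed by a cosine column.
            expanded.extend([f"{name}_sin", f"{name}_cos"])
        else:
            expanded.append(name)
    return expanded


# 3 periodic names plus 25 placeholder names, 28 in total.
names = ["month", "day_type", "hour"] + [f"obs_{i}" for i in range(25)]
expanded = expanded_observation_names(names, periodic=["month", "day_type", "hour"])
print(len(names), len(expanded))  # 28 31
```

Whether the real column order matches this layout would need to be checked against the wrapper itself.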
Hi, I think there is a bug in citylearn.reward_function.MARL.
When I use the MARL reward function in citylearn.reward_function as my reward function, a bug appears at line 64 of this file.
It should give the maximum of 0 and district_electricity_consumption.
TypeError: 'numpy.float64' object cannot be interpreted as an integer
To reproduce, just replace the reward function with MARL, and the bug appears.
To fix it, simply add parentheses inside the np.nanmax call:
reward = np.sign(building_electricity_consumption)*0.01*building_electricity_consumption**2*np.nanmax((0, district_electricity_consumption))
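The TypeError arises because the second positional parameter of np.nanmax is axis, so np.nanmax(0, x) passes x as the axis; wrapping both values in a tuple makes them the data. The sketch below reproduces the corrected formula with the standard library only, so nanmax and sign here are stand-ins written for illustration, not NumPy's functions.

```python
import math
from typing import Iterable


def nanmax(values: Iterable[float]) -> float:
    """Largest non-NaN value; a stdlib stand-in for np.nanmax on a sequence."""
    return max(v for v in values if not math.isnan(v))


def sign(x: float) -> int:
    """Return -1, 0 or 1, mirroring np.sign for real numbers."""
    return (x > 0) - (x < 0)


def marl_reward(building: float, district: float) -> float:
    """Corrected formula: both candidates are passed as one sequence."""
    return sign(building) * 0.01 * building**2 * nanmax((0.0, district))
```

With this form, a negative district consumption clamps the reward to zero instead of raising.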
Issue Description
I am encountering an error while running the code provided in the official CityLearn documentation. I have not modified a single line of the code, and I'm using the exact code snippet provided.
Expected Behavior
I expected the code to run without errors, as it's directly taken from the official CityLearn documentation.
Actual Behavior
I am facing the following error:
ValueError: not enough values to unpack (expected 2, got 1)
Steps to Reproduce
Install CityLearn (version 2.0b4), Stable Baselines 3 (version 2.0.0), and Gym (version 0.26.1).
Run the code provided in the official CityLearn documentation (https://www.citylearn.net/quickstart.html):
Code:
from stable_baselines3.sac import SAC
from citylearn.citylearn import CityLearnEnv
from citylearn.wrappers import NormalizedObservationWrapper, StableBaselines3Wrapper
dataset_name = 'baeda_3dem'
env = CityLearnEnv(dataset_name, central_agent=True, simulation_end_time_step=1000)
env = NormalizedObservationWrapper(env)
env = StableBaselines3Wrapper(env)
model = SAC('MlpPolicy', env)
model.learn(total_timesteps=env.time_steps*2)
observations = env.reset()
while not env.done:
    actions, _ = model.predict(observations, deterministic=True)
    observations, _, _, _ = env.step(actions)
kpis = env.evaluate().pivot(index='cost_function', columns='name', values='value')
kpis = kpis.dropna(how='all')
display(kpis)
Environment
CityLearn version: 2.0b4
Operating System: Windows
Python version: 3.10
Possible Solution
I have tried various solutions and referred to the official documentation, but I am unable to find a compatible version combination that resolves this issue. I'm looking for guidance from the community.
Additional Notes
I'm following the instructions exactly as provided in the official documentation, so I'm puzzled as to why I'm encountering this issue. If anyone has experience with these libraries and can provide guidance or suggestions, I would greatly appreciate it.
Thank you for your time and assistance!
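The error "expected 2, got 1" is consistent with a caller that expects reset to return two values (observations and info) while the environment returns only observations. That reading is an assumption, since the thread does not show the full traceback. A hypothetical helper that tolerates both shapes could look like this:

```python
from typing import Any, Dict, Tuple


def unpack_reset(result: Any) -> Tuple[Any, Dict[str, Any]]:
    """Return (observations, info) whether reset gave one value or two."""
    is_pair = (
        isinstance(result, tuple)
        and len(result) == 2
        and isinstance(result[1], dict)
    )
    if is_pair:
        return result[0], result[1]
    # Single-value form: only observations were returned.
    return result, {}
```

This only helps in user code that calls reset directly; if the mismatch is inside a library, aligning the installed versions is the more likely remedy.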
Divisor in SOC should be capacity before any degradation for electrical_storage.
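As a sketch of the requested behaviour, state of charge can be computed against the capacity the storage had before any degradation. The function and parameter names are hypothetical and do not come from the CityLearn source.

```python
def state_of_charge(stored_energy: float, nominal_capacity: float) -> float:
    """SOC as a fraction of the original (undegraded) capacity."""
    if nominal_capacity <= 0:
        raise ValueError("nominal_capacity must be positive")
    return stored_energy / nominal_capacity


# 4 kWh stored in a battery that started at 10 kWh and has degraded to 8 kWh.
soc_nominal = state_of_charge(4.0, 10.0)   # divisor requested in this issue
soc_degraded = state_of_charge(4.0, 8.0)   # divisor after degradation
print(soc_nominal, soc_degraded)  # 0.4 0.5
```

Dividing by the degraded capacity makes the same stored energy look like a higher SOC as the battery ages.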
Is your feature request related to a problem? Please describe.
Need a way to define thermostat operation and schedules especially when there is occupant interaction.
Describe the solution you'd like
Create a thermostat class with basic functions like update setpoint, apply hold, sense occupancy, revert hold.
Describe alternatives you've considered
Implementing the thermostat logic directly in the building class when setpoint is updated but does not generalize well especially for custom implementations.
Additional context
NIL
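A minimal sketch of what such a thermostat class might look like, covering the four basic functions listed in the request. The class layout, attribute names and units are all hypothetical and are meant as a starting point for discussion.

```python
from typing import Optional


class Thermostat:
    """Hypothetical thermostat with a scheduled setpoint and an optional hold."""

    def __init__(self, scheduled_setpoint: float) -> None:
        self.scheduled_setpoint = scheduled_setpoint
        self.hold_setpoint: Optional[float] = None
        self.occupied = False

    @property
    def setpoint(self) -> float:
        """Active setpoint: the hold if one is applied, else the schedule."""
        if self.hold_setpoint is not None:
            return self.hold_setpoint
        return self.scheduled_setpoint

    def update_setpoint(self, value: float) -> None:
        """Change the scheduled setpoint (for example, at a schedule boundary)."""
        self.scheduled_setpoint = value

    def apply_hold(self, value: float) -> None:
        """Override the schedule, for example after an occupant interaction."""
        self.hold_setpoint = value

    def revert_hold(self) -> None:
        """Drop the override and return to the schedule."""
        self.hold_setpoint = None

    def sense_occupancy(self, occupied: bool) -> None:
        """Record the latest occupancy reading."""
        self.occupied = occupied
```

Keeping the hold separate from the schedule means a schedule update during a hold is not lost when the hold is reverted.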
When resetting the env, the electrical_storage capacity of each building doesn't reset.
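The expected behaviour can be sketched with a toy storage class that remembers its nominal capacity and restores it on reset. This is an illustration under assumed names, not the CityLearn implementation.

```python
class StorageSketch:
    """Toy storage whose capacity degrades and is restored on reset."""

    def __init__(self, nominal_capacity: float) -> None:
        self.nominal_capacity = nominal_capacity  # never modified
        self.capacity = nominal_capacity          # degrades during an episode

    def degrade(self, fraction: float) -> None:
        """Reduce the usable capacity by the given fraction."""
        self.capacity *= 1.0 - fraction

    def reset(self) -> None:
        """Restore the capacity the storage had at the start of the episode."""
        self.capacity = self.nominal_capacity
```

Without the assignment in reset, degradation from one episode would carry over into the next.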
Hi, thank you for sharing this repo.
I was experimenting with the CityLearn environment and the Marlisa agent, and found that the dimension of the states varies depending on how it is queried.
For example, env.observation_space.shape[0]
returns 91, whereas after env.reset()
the dimension of the state is (28, 9); I think the 9 is the number of buildings. Furthermore, if I save the states of a single building in the replay buffer, the dimension becomes 36.
I am quite confused: what was the dimension of the states used in the CityLearn challenge, the Marlisa paper, etc.?
As a CityLearn developer, I want to know when a functionality in CityLearn fails after a change has been made to the source code or its dependencies' versions so that I can easily debug the problem and release a new CityLearn version.
As a CityLearn developer, I want to speed up the training of RL agents so that I can use fewer HPC resources for simulations, train for longer episodes and scale up my district size.
This enhancement only applies to the internally defined RL agents in CityLearn:
The citylearn.py, building.py and energy_model.py modules can also benefit from source code optimization for speed.
One approach can be to profile the simulation of the SAC agent example in example.ipynb.
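A minimal sketch of the profiling approach using the standard library's cProfile and pstats. The simulate function is a stand-in workload invented for this example; in practice it would be replaced by the call that trains the agent.

```python
import cProfile
import io
import pstats
from typing import Any, Callable


def simulate(steps: int) -> float:
    """Stand-in workload; replace with the agent's training call."""
    total = 0.0
    for t in range(steps):
        total += (t % 24) ** 0.5
    return total


def profile_call(func: Callable[..., Any], *args: Any) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return stream.getvalue()


report = profile_call(simulate, 10_000)
print(report)
```

Sorting by cumulative time tends to surface the outer loops first, which is a reasonable place to start looking for hot spots.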
In last year's competition, we were allowed to submit an optional file for model weights and policy params. There was no information about this for this year's competition. Can we submit a pre-trained agent, i.e. include the weight file?
Make sure net_electricity_consumption is always an active state when using MARLISA agents and that the agent algorithm is aware of the position of net_electricity_consumption in the list of observation values. This is important because net_electricity_consumption needs to be identified as the predicted value for the internal regression model.
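A sketch of the check described above: look up the position of net_electricity_consumption in the list of active observation names and fail early if it is missing. The helper name and the example names are hypothetical.

```python
from typing import Sequence


def target_index(
    observation_names: Sequence[str],
    target: str = "net_electricity_consumption",
) -> int:
    """Return the position of the target observation, or fail with a clear message."""
    try:
        return list(observation_names).index(target)
    except ValueError:
        raise ValueError(f"'{target}' must be an active observation") from None


# Hypothetical list of active observations for one building.
names = ["month", "hour", "net_electricity_consumption"]
position = target_index(names)
print(position)  # 2
```

Looking the index up by name avoids hard-coding a position that changes whenever the set of active observations changes.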
Is your feature request related to a problem? Please describe.
I want to be able to use RLlib library with CityLearn
Describe the solution you'd like
Describe alternatives you've considered
NIL
Additional context
NIL