ziyuanma / dhc Goto Github PK

Distributed Heuristic Multi-Agent Path Finding with Communication - ICRA 2021

Python 100.00%

dhc's Issues

Question of Worker.py

In Actor.run, why is only the first agent taken, and not the others (second, third, ...)?

I have a question about scalability.

Hi, if I train a model in an environment containing 5 agents, can I test the model in a new environment containing 10 agents without any change of the code of the model?

Unreachable agent targets might be generated.

        # ... line 175 of environment.py
        for i in range(self.num_agents):

            pos_idx = random.randint(0, pos_num - 1)
            partition_idx = 0
            for partition in partition_list:
                if pos_idx >= len(partition):
                    pos_idx -= len(partition)
                    partition_idx += 1
                else:
                    break

            pos = random.choice(partition_list[partition_idx])
            partition_list[partition_idx].remove(pos)
            self.agents_pos[i] = np.asarray(pos, dtype=np.int)

            pos = random.choice(partition_list[partition_idx])
            partition_list[partition_idx].remove(pos)
            self.goals_pos[i] = np.asarray(pos, dtype=np.int)

            partition_list = [partition for partition in partition_list if len(partition) >= 2]
            pos_num = sum([len(partition) for partition in partition_list])

This might not be the right way to generate positions for the agents: the connectivity of a partition may be disrupted once agents have been placed in it, which can produce agent targets that are unreachable.
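The concern above can be checked directly. Below is a minimal sketch, not taken from the repository, of a breadth-first search that tests whether a goal is reachable from a start cell on a 4-connected grid. The function name `reachable` and the convention that 0 marks a free cell are assumptions for illustration, and treating an already placed agent as a blocked cell is likewise only one possible way to model the problem.

```python
from collections import deque


def reachable(grid, start, goal):
    """Return True if goal can be reached from start through free cells.

    Assumed convention (hypothetical, for illustration): grid[r][c] == 0
    means the cell is free, anything else means it is blocked.
    """
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return True
        # Expand the four axis-aligned neighbours.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


# A one-cell-wide corridor: blocking the middle cell splits the partition.
corridor = [[0, 0, 0]]
blocked = [[0, 1, 0]]
print(reachable(corridor, (0, 0), (0, 2)))  # True
print(reachable(blocked, (0, 0), (0, 2)))   # False
```

A generator could run a check like this after sampling each start and goal pair and resample when it fails, at the cost of one search per agent.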

The TD error calculation seems incorrect.

The TD error calculation is currently:

        q_max = np.max(self.q_buf[:self.size], axis=1)
        ret = self.rew_buf.tolist() + [ 0 for _ in range(configs.forward_steps-1)]
        reward = np.convolve(ret, [0.99**(configs.forward_steps-1-i) for i in range(configs.forward_steps)],'valid')+q_max

I think the following should be added, so that the bootstrap value is discounted as well:

        q_max = q_max * (configs.gamma ** configs.forward_steps)
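For reference, here is a minimal sketch of the standard n-step target, in which the bootstrap term is scaled by gamma to the power n. It is not the repository's code: the function name `n_step_targets` and the argument `q_boot` are hypothetical, and `q_boot[t]` is assumed to already hold the maximum Q-value of the state n steps after step t (0 where the episode has ended).

```python
import numpy as np


def n_step_targets(rewards, q_boot, gamma, n):
    """Compute sum_{k<n} gamma^k * r[t+k] + gamma^n * q_boot[t] for every t.

    rewards and q_boot are assumed to have the same length (hypothetical
    layout, chosen for illustration).
    """
    rewards = np.asarray(rewards, dtype=float)
    q_boot = np.asarray(q_boot, dtype=float)
    # Pad with n-1 zeros so the last steps still get a full-length window.
    padded = np.concatenate([rewards, np.zeros(n - 1)])
    # np.convolve flips its second argument, so the kernel is stored reversed.
    kernel = gamma ** np.arange(n - 1, -1, -1)
    discounted_rewards = np.convolve(padded, kernel, mode="valid")
    # The bootstrap value lies n steps in the future, hence gamma ** n.
    return discounted_rewards + (gamma ** n) * q_boot


targets = n_step_targets(rewards=[1, 1, 1], q_boot=[4, 0, 0], gamma=0.5, n=2)
print(targets)  # [2.5 1.5 1. ]
```

With gamma = 0.5 and n = 2, the first target is 1 + 0.5 * 1 + 0.25 * 4 = 2.5. Without the gamma ** n factor it would be 5.5.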

Cannot reproduce the performance of the trained model

Hi, thank you for this repo!

I used the hyperparameters from configs.py for training, but after 150k+ steps the model's performance has barely changed (and is still bad), and the loss has stayed nearly the same as well (~0.0042).

I was wondering whether you have made any changes to the parameters and, if so, whether you would be able to share them.

Thanks!

A bug in environment.py?

Hello! I would like to ask about a small problem.

Is there a bug on line 447: obs[i, 0] = agent_map[x:x+2*self.obs_radius+1, y:y+2*self.obs_radius+1]?

I think it should perhaps be: obs[i, 0] = agent_map[x-self.obs_radius : x+self.obs_radius+1, y-self.obs_radius : y+self.obs_radius+1]

The code is in the observe function below.

def observe(self):
        '''
        return observation and position for each agent

        obs: shape (num_agents, 11, 2*obs_radius+1, 2*obs_radius+1)
            layer 1: agent map 
            layer 2: obstacle map
            layer 3-6: heuristic map
            layer 7-11: one-hot representation of agent's last action
        
        pos: used for calculating communication mask

        '''
        obs = np.zeros((self.num_agents, 6, 2*self.obs_radius+1, 2*self.obs_radius+1), dtype=np.bool)

        # 0 represents obstacle to match 0 padding in CNN 
        obstacle_map = np.pad(self.map, self.obs_radius, 'constant', constant_values=0)

        agent_map = np.zeros((self.map_size), dtype=np.bool)
        agent_map[self.agents_pos[:,0], self.agents_pos[:,1]] = 1
        agent_map = np.pad(agent_map, self.obs_radius, 'constant', constant_values=0)

        for i, agent_pos in enumerate(self.agents_pos):
            x, y = agent_pos

            obs[i, 0] = agent_map[x:x+2*self.obs_radius+1, y:y+2*self.obs_radius+1]
            obs[i, 0, self.obs_radius, self.obs_radius] = 0
            obs[i, 1] = obstacle_map[x:x+2*self.obs_radius+1, y:y+2*self.obs_radius+1]
            obs[i, 2:] = self.heuri_map[i, :, x:x+2*self.obs_radius+1, y:y+2*self.obs_radius+1]

        # obs = np.concatenate((obs, self.last_actions), axis=1)

        return obs, np.copy(self.agents_pos)
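One point worth checking before changing the slice: in the function above, agent_map and obstacle_map are padded by obs_radius on every side before they are indexed, so index x in the padded array corresponds to x - obs_radius in the original map. The sketch below, which uses a small stand-in grid and a hypothetical helper `local_window` rather than the repository's data, suggests that slicing the padded array from x to x + 2*r + 1 already yields a window centered on the agent.

```python
import numpy as np


def local_window(grid, x, y, r):
    """Return the (2r+1) x (2r+1) window centered on (x, y), zero-padded.

    Mirrors the pad-then-slice pattern quoted in the issue; the helper
    itself is hypothetical.
    """
    padded = np.pad(grid, r, 'constant', constant_values=0)
    return padded[x:x + 2 * r + 1, y:y + 2 * r + 1]


r = 2                                   # stand-in for obs_radius
grid = np.arange(1, 37).reshape(6, 6)   # non-zero values, distinct from padding

# Interior cell: the padded slice equals the centered slice of the raw grid.
x, y = 3, 2
centered = grid[x - r:x + r + 1, y - r:y + r + 1]
print(np.array_equal(local_window(grid, x, y, r), centered))  # True

# Corner cell: the window keeps its full shape and the agent stays in the center.
corner = local_window(grid, 0, 0, r)
print(corner.shape, corner[r, r] == grid[0, 0])  # (5, 5) True
```

Applying the proposed x - obs_radius offsets to the padded array would shift the window away from the agent, and near the map edge the negative start index would return an empty or wrongly shaped slice.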

Can I use your DHC/comm pth file?

Can I use your DHC/comm pth file? Alternatively, could I see your DHC/comm results as a table (not a graph)?

I need it to compare my algorithm with yours.

Thank you.

TypeError: can't pickle function objects

When I run train.py, I get "TypeError: can't pickle function objects". How can I fix it?
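The issue does not say where the error is raised, so the cause here is unknown. One common source of errors of this kind, shown below as a general illustration only, is that Python's standard `pickle` module serializes functions by importable name, so anonymous or locally defined functions cannot be pickled when they are sent to another process. The helper `can_pickle` is hypothetical, and the exact exception type and message vary across Python versions.

```python
import math
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions raise different exceptions here.
        return False


# A function importable by name is pickled as a reference to that name.
print(can_pickle(math.sqrt))      # True

# A lambda has no importable name, so the standard pickler rejects it.
print(can_pickle(lambda x: x))    # False
```

If this turns out to be the cause, replacing lambdas or nested functions that cross a process boundary with module-level functions is the usual remedy.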
