
yining043 / vrp-dact


This repo implements our paper, "Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer", which has been accepted at NeurIPS 2021.

License: MIT License

Python 41.25% Jupyter Notebook 58.75%
reinforcement-learning vehicle-routing-problem neural-combinatorial-optimization transformer tsp vrp

vrp-dact's People

Contributors

yining043


vrp-dact's Issues

Q: Details about Run time Comparison

Thanks for your novel idea and clean code!
I'm a newcomer currently doing a literature review in this area and testing different solvers on the dataset provided here. I found that solving CVRP20 instances with LKH (3.0.7) is quite fast, about 0.01 s per instance. However, the runtime reported in the paper "Learning to Iteratively Solve Routing Problems with Dual-Aspect Collaborative Transformer" is 1 h, which is far from my result.
Could you please give more detail about the experimental setup? Thanks a lot!

How can I get the final tour for a TSP instance?

I want to plot the resulting tour for a TSP instance, but I don't know how to output the specific tour.
If I want to get the tour produced by the pretrained model "tsp20-epoch-199.pt", how should I modify run.py or other files?
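
For reference, the solution in this codebase is represented by the rec tensor discussed in a later issue; as I understand it (please correct me if wrong), rec[i] stores the node visited immediately after node i, i.e., a successor ("linked-list") encoding rather than an ordered sequence. Assuming that representation, a minimal sketch for turning a single decoded rec into an ordered tour for plotting could look like this (the function and variable names are illustrative, not part of run.py):

import torch

def rec_to_tour(rec):
    # rec[i] is assumed to be the node visited right after node i;
    # follow the successor pointers starting from node 0 until we loop back
    tour = [0]
    nxt = int(rec[0])
    while nxt != 0:
        tour.append(nxt)
        nxt = int(rec[nxt])
    return tour

rec = torch.tensor([2, 3, 1, 0])   # toy example encoding the tour 0 -> 2 -> 1 -> 3 -> 0
print(rec_to_tour(rec))            # [0, 2, 1, 3]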

How can I train on my own data?

Hello, thank you for your work. I am very interested in this and would like to train the model on my own data.
I'm a beginner in machine learning.
How can I train on my own data, and what format does the data need to be in?
Looking forward to your reply.
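
For context, the validation files shipped with the repo (e.g., datasets/tsp_50_10000.pkl) appear to use the pickled format common in learning-to-route codebases: a list of instances, each a list of (x, y) coordinates in the unit square. Assuming that format also applies here (worth double-checking against the repo's problem loaders), generating a small custom TSP dataset might look like the sketch below; the file name is only an example.

import pickle
import numpy as np

n_instances, graph_size = 1000, 20
rng = np.random.default_rng(1234)

# one instance = graph_size coordinates sampled uniformly from the unit square
dataset = [rng.random((graph_size, 2)).tolist() for _ in range(n_instances)]

with open('my_tsp_20_1000.pkl', 'wb') as f:
    pickle.dump(dataset, f)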

Some questions about the CPE in the code

Hi, I have some questions about the CPE implementation (nets/graph_layers.py, EmbeddingNet / Cyclic_Positional_Encoding):

skip_base = np.power(n_position, 1 / (emb_dim // 2))
skip_set = np.linspace(skip_base, n_position, emb_dim // 2, dtype = 'int')

I assume that skip_set contains the wavelengths, which lie in the range $[N^{1/(d/2)}, N]$, and that skip corresponds to $\omega_d$ in the paper.
However, the loop below runs d times while skip_set only holds d/2 values. Could you explain the logic behind this formulation?

for i in range(emb_dim):
    # see Appendix B
    skip = skip_set[i // 3 * 3 + 1] if (i // 3 * 3 + 1) < (emb_dim // 2) else skip_set[-1]

In addition, could you explain how the variable fai in the code maps to the notation in the paper? I would really appreciate a reply. Thanks :)

# see Appendix B
fai = 0 if i <= (emb_dim // 2) else  2 * np.pi * ((-i + (emb_dim // 2)) / (emb_dim // 2))
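
As a quick, self-contained illustration of the indexing question above (just re-running the quoted lines with a small toy embedding size, not an answer), one can print which entry of skip_set each embedding dimension i gets mapped to:

import numpy as np

n_position, emb_dim = 100, 8                      # toy values for illustration only
skip_base = np.power(n_position, 1 / (emb_dim // 2))
skip_set = np.linspace(skip_base, n_position, emb_dim // 2, dtype='int')

for i in range(emb_dim):
    idx = i // 3 * 3 + 1
    skip = skip_set[idx] if idx < (emb_dim // 2) else skip_set[-1]
    print(f"dim {i} -> skip_set index {min(idx, len(skip_set) - 1)}, skip {skip}")

With emb_dim = 8, skip_set has only 4 entries: dimensions 0-2 share index 1 and dimensions 3-7 all fall back to the last entry, which is exactly the d versus d/2 mismatch the question is about.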

The inference time

Hi, it's me again ;)

When I run the following command to run inference on TSP50 instances (on a single TITAN RTX GPU with 24 GB of memory),

python run.py --graph_size 50 --eval_only --load_path pretrained/tsp50-epoch-198.pt --T_max 5000 \
              --val_size 10000 --val_dataset datasets/tsp_50_10000.pkl --val_m 1 --no_tb --no_saving --no_DDP

it takes about 30 minutes to finish:

 'val_dataset': 'datasets/tsp_50_10000.pkl',
 'val_m': 1,
 'val_size': 10000,
 'world_size': 1}
TSP with 50 nodes.  Do assert: False
{'Total': 280801, 'Trainable': 280801}
Distributed: False
 [*] Loading data from pretrained/tsp50-epoch-198.pt

Inference with x1 augments...
10000 instances initialized.
rollout:   1%|▏                   | 58/5000 [00:21<30:37,  2.69it/s]

(The rollout progresses smoothly and there doesn't seem to be any bottleneck.)

For TSP100, I need to reduce the batch size (e.g., to 4096), and the rollout of one batch then takes 40 minutes.

According to the paper, these should be 6 and 18 minutes respectively (for one GPU and a batch size of 512). As far as I can tell, the GPU used in the paper is also a TITAN RTX. Are there any other techniques used to speed up inference?

How to understand the 'rec' variable in the implementation?

In line 231 of problem_tsp.py, length = (d1 - d2).norm(p=2, dim=2).sum(1). Why is the edge from the last node back to the first node not included?

------- SUMMARY -------
Hi all, I have added an explanation related to this issue to the README; see 'Documentation'.

We also provide a Jupyter notebook to help you get started and understand our code. Please open the Jupyter notebook for more details.
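
As a minimal illustration of why the return edge is not missing (assuming, as explained in the README, that the solution is stored as a cyclic successor list rather than an ordered sequence): every node contributes exactly one outgoing edge, so summing the n per-node distances to each node's successor already covers the edge back to the start. The snippet below is only a sanity check, not the repo's code:

import torch

loc = torch.tensor([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])   # four nodes on a unit square
rec = torch.tensor([1, 2, 3, 0])                                # successor list: 0 -> 1 -> 2 -> 3 -> 0

succ = loc[rec]                                 # coordinates of each node's successor
length = (succ - loc).norm(p=2, dim=1).sum()
print(length.item())                            # 4.0, including the edge from node 3 back to node 0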

About learning curves

Thanks for your paper and code.

When I train DACT (for TSP), the loss occasionally spikes, although it doesn't affect the final result. Is this expected, or did I do something wrong?

[Two learning-curve screenshots were attached here.]

I used --max_grad_norm 0.1 for TSP100 (the README uses --max_grad_norm 0.2 for TSP50).
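
For context, a --max_grad_norm option like this usually corresponds to clipping the global gradient norm between the backward pass and the optimizer step, which bounds the size of any single update even when the loss spikes; where exactly this repo applies it should be checked in its training loop. A generic, self-contained PyTorch sketch of the pattern (not the repo's code):

import torch

model = torch.nn.Linear(4, 1)                       # stand-in for the policy network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)    # stand-in for the RL loss
loss.backward()

# clip the global gradient norm to max_grad_norm before stepping
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.1)
optimizer.step()
optimizer.zero_grad()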

Clarification needed on post and pre computation in code

Hello,

I've come across this segment of code and I'm trying to understand its intent and functionality:
argsort = solution.argsort()
loc = x_in[:, :, :2]
post = loc.gather(1, solution.view(bs, -1, 1).expand_as(loc))
pre = loc.gather(1, argsort.view(bs, -1, 1).expand_as(loc))
post = torch.norm(post - loc, 2, -1, True)
pre = torch.norm(pre - loc, 2, -1, True)
What do post and pre mean?
I read them as the next point and the previous point of the current point.
From the comments and variable names, it seems the goal is to compute the coordinates of the next (post) and previous (pre) node for each node based on the given solution. However, based on the behavior of loc.gather(...), it seems the current implementation does not achieve this; instead, it looks like it is simply rearranging the nodes according to the solution order.
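
For what it's worth, here is a toy check of what that gather actually returns, under the assumption (consistent with the rec discussion above) that solution[b, i] is the node visited right after node i: post[b, i] then holds the coordinates of node i's successor, pre[b, i] (via argsort, the inverse permutation) holds the coordinates of its predecessor, and the two norms are per-node distances to the neighbouring nodes rather than a re-ordered tour. The snippet only mimics the quoted lines on a hand-made instance:

import torch

bs = 1
loc = torch.tensor([[[0., 0.], [1., 0.], [1., 1.], [0., 1.]]])   # shape (bs, n, 2)
solution = torch.tensor([[1, 2, 3, 0]])                           # successor list: 0 -> 1 -> 2 -> 3 -> 0
argsort = solution.argsort()                                      # predecessor list: [[3, 0, 1, 2]]

post = loc.gather(1, solution.view(bs, -1, 1).expand_as(loc))     # coords of each node's successor
pre = loc.gather(1, argsort.view(bs, -1, 1).expand_as(loc))       # coords of each node's predecessor

print(torch.norm(post - loc, 2, -1, True).squeeze())              # tensor([1., 1., 1., 1.])
print(torch.norm(pre - loc, 2, -1, True).squeeze())               # tensor([1., 1., 1., 1.])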

TSP cost function

Hello, first of all, thanks for your work. I have a question/concern regarding your get_costs function: I don't fully follow the computation of the tour cost. d1 makes perfect sense, but d2 seems wrong, as I would expect it to be similar to d1 with the tour shifted by one position. From that code (in both branches), I don't see how the results of ~8 for TSP100 are achieved, since I believe the correct tour cost is not being computed.

It would be great if you could clarify this :)
Best regards
