[MLSys 2022] "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling" by Cheng Wan, Youjie Li, Ang Li, Nam Sung Kim, Yingyan Lin
Dear authors of BNS-GCN,
I have questions about the Epoch Time and Epoch Comm values in Table 9 of the paper.
How is the Epoch Comm number obtained? The experiment output includes "memory stats"; is it taken from that result?
The output of each epoch contains multiple epoch times, one per processor. Is the reported epoch time the average or the maximum across processors? (The same question applies to Epoch Comm.)
Thank you very much!
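To make the average-versus-maximum question above concrete, here is a minimal sketch of the two candidate aggregations. It is only an illustration: the helper name `summarize_epoch` and the per-process numbers are hypothetical, and it does not claim to reproduce how the values in Table 9 were actually computed.

```python
from statistics import mean


def summarize_epoch(per_rank_values):
    """Return both candidate aggregations of one per-process metric.

    `per_rank_values` holds one number per process, for example the
    epoch time each process printed for the same epoch.
    """
    values = list(per_rank_values)
    if not values:
        raise ValueError("need at least one per-process value")
    return {"mean": mean(values), "max": max(values)}


# Hypothetical per-process epoch times in seconds (made-up numbers,
# not taken from any BNS-GCN log).
epoch_times = [0.5, 0.75, 1.0, 0.75]
summary = summarize_epoch(epoch_times)
print(summary)
```

With synchronous training the slowest process bounds the wall-clock time of an epoch, so the maximum and the mean can differ noticeably when partitions are imbalanced; which of the two the paper reports is exactly what is being asked here.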
Hello, I have encountered some problems when using BNS-GCN. From the code you provided, I can see that three models are available. However, after I modify the model specified in the script, an error occurs. I'm not sure what else I have to modify if I want to use the other two models.
I have a question about the epoch time breakdown.
Is the first time shown in the picture the computation time, or the total time of one training epoch?
I have already added the --no_eval argument, but the time breakdown results still look strange and do not seem to match the results in the paper. That's why I am asking.
Thanks!
Thanks for your help. I tried to partition the large dataset paper100m, but it fails with the log below. Could you help resolve this? I am sure there is enough memory (~1 TB). I'm not sure whether the failure is because this node has no GPU. (If so, could you please share the partition results for paper100m, say with 40 partitions?)
May I ask how the steps marked with rectangular frames in the formula derivation on pages 14/15 of your paper, "BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling", were derived?