
graphsaint / graphsaint


[ICLR 2020; IPDPS 2019] Fast and accurate minibatch training for deep GNNs and large graphs (GraphSAINT: Graph Sampling Based Inductive Learning Method).

Home Page: https://openreview.net/forum?id=BJe8pkHFwS

License: MIT License

Python 60.56% C++ 26.70% C 0.78% Makefile 0.25% Cython 11.71%
gcn graph-sampling iclr graphsage jk-net gat ipdps

graphsaint's Introduction

GraphSAINT: Graph Sampling Based Inductive Learning Method

Hanqing Zeng*, Hongkuan Zhou*, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna

Contact

Hanqing Zeng ([email protected]), Hongkuan Zhou ([email protected])

Feel free to report bugs or tell us your suggestions!

Overview

GraphSAINT is a general and flexible framework for training GNNs on large graphs. GraphSAINT highlights a novel minibatch method specifically optimized for data with complex relationships (i.e., graphs).

The traditional way of training a GNN is:

  1. Construct a GNN on the full training graph.
  2. For each minibatch, pick some nodes at the output layer as the root nodes. Backtrack the inter-layer connections from the root nodes until reaching the input layer.
  3. Perform forward and backward propagation based on the loss on the roots.

The way GraphSAINT trains a GNN is:

  1. For each minibatch, sample a small subgraph from the full training graph.
  2. Construct a complete GNN on the small subgraph. No sampling is performed within GNN layers.
  3. Perform forward and backward propagation based on the loss on the subgraph nodes.
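The graph-sampling steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's implementation: the uniform node sampler, the dense toy adjacency matrix, the row normalization and the names `sample_node_subgraph` and `gcn_forward` are all assumptions made for the example.

```python
import numpy as np


def sample_node_subgraph(adj, num_nodes, rng):
    """Pick `num_nodes` nodes uniformly at random and return the induced subgraph."""
    nodes = np.sort(rng.choice(adj.shape[0], size=num_nodes, replace=False))
    return nodes, adj[np.ix_(nodes, nodes)]


def gcn_forward(adj_sub, feats_sub, weights):
    """Run a complete (unsampled) GCN on the subgraph: one propagation per layer."""
    # Add self-loops and row-normalize so each node averages over its neighborhood.
    a_hat = adj_sub + np.eye(adj_sub.shape[0])
    a_hat = a_hat / a_hat.sum(axis=1, keepdims=True)
    h = feats_sub
    for w in weights:
        h = np.maximum(a_hat @ h @ w, 0.0)  # ReLU activation
    return h


rng = np.random.default_rng(0)
n, f, hidden = 100, 8, 16
# Toy symmetric training graph as a dense 0/1 matrix (a real graph would be sparse).
upper = np.triu(rng.random((n, n)) < 0.05, k=1)
adj_train = (upper | upper.T).astype(np.float64)
feats = rng.standard_normal((n, f))
weights = [rng.standard_normal((f, hidden)), rng.standard_normal((hidden, hidden))]

for step in range(3):
    # Step 1: sample a small subgraph. Step 2: build the full GNN on it.
    nodes, adj_sub = sample_node_subgraph(adj_train, num_nodes=20, rng=rng)
    out = gcn_forward(adj_sub, feats[nodes], weights)
    # Step 3 (omitted): compute the loss on all subgraph nodes and backpropagate.
    print(step, out.shape)
```

Every layer reuses the same 20 subgraph nodes, so the work per layer does not grow with depth.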

GraphSAINT training algorithm

GraphSAINT performs "graph sampling" based training, whereas others perform "layer sampling" based training. Why does it matter to change the perspective of sampling? GraphSAINT achieves the following:

Accuracy: We perform simple yet effective normalization to eliminate the bias introduced by graph sampling. In addition, since any sampling process incurs information loss due to dropped neighbors, we propose light-weight graph samplers to preserve important neighbors based on topological characteristics. In fact, graph sampling can also be understood as data augmentation or training regularization (e.g., we may see the edge sampling as a minibatch version of DropEdge).
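As a rough illustration of how normalization coefficients could be estimated, the sketch below counts how often each node and edge appears in a set of pre-sampled subgraphs, following the approximations quoted in one of the issues further down this page (alpha_{u,v} = C_{u,v} / C_v and lambda_v = C_v / N). It assumes N is the number of pre-sampled subgraphs; the function name and data layout are hypothetical, and how the coefficients are applied to the aggregation and the loss is not shown.

```python
import numpy as np


def estimate_norm_coefficients(edges, sampled_subgraphs, num_nodes):
    """Estimate lambda_v = C_v / N and alpha_{u,v} = C_{u,v} / C_v by counting.

    edges: list of (u, v) pairs; sampled_subgraphs: list of node-index lists.
    """
    num_samples = len(sampled_subgraphs)          # N (assumed: number of subgraphs)
    node_count = np.zeros(num_nodes)              # C_v
    edge_count = {e: 0 for e in edges}            # C_{u,v}
    for nodes in sampled_subgraphs:
        in_sub = np.zeros(num_nodes, dtype=bool)
        in_sub[nodes] = True
        node_count += in_sub
        for (u, v) in edges:
            # An edge is counted when both of its endpoints were sampled.
            if in_sub[u] and in_sub[v]:
                edge_count[(u, v)] += 1
    lam = node_count / num_samples
    # Skip edges whose target node was never sampled to avoid dividing by zero.
    alpha = {(u, v): c / node_count[v]
             for (u, v), c in edge_count.items() if node_count[v] > 0}
    return lam, alpha


edges = [(0, 1), (1, 2), (2, 3)]
subgraphs = [[0, 1], [1, 2], [0, 1, 2]]
lam, alpha = estimate_norm_coefficients(edges, subgraphs, num_nodes=4)
print(lam)
print(alpha)
```

Nodes that appear in more subgraphs get a larger lambda_v, which is the quantity a sampler-aware loss would need to correct for.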

Efficiency: While "neighbor explosion" is a headache for many layer sampling based methods, GraphSAINT provides a clean solution to it thanks to the graph sampling philosophy. As each GNN layer is complete and unsampled, the number of neighbors stays constant no matter how deep we go. Computation cost per minibatch is reduced from exponential to linear w.r.t. GNN depth.
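The exponential-versus-linear claim can be checked with simple arithmetic. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a layer sampler in which every node draws a fixed number of neighbors (`fanout`) per layer with no overlap, and the batch and subgraph sizes are made-up numbers.

```python
def layer_sampling_nodes(batch_size, fanout, depth):
    """Upper bound on node activations when each node samples `fanout` neighbors per layer."""
    # Roots at the output layer, then fanout times more nodes at each earlier layer.
    return sum(batch_size * fanout ** k for k in range(depth + 1))


def graph_sampling_nodes(subgraph_size, depth):
    """Node activations when every layer operates on the same subgraph nodes."""
    return subgraph_size * (depth + 1)


for depth in (2, 4):
    print(depth,
          layer_sampling_nodes(batch_size=1000, fanout=10, depth=depth),
          graph_sampling_nodes(subgraph_size=8000, depth=depth))
```

Doubling the depth from 2 to 4 multiplies the layer-sampling bound by roughly 100, while the graph-sampling count grows only from 24000 to 40000.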

Flexibility: Layer propagation on a minibatch subgraph of GraphSAINT is almost identical to that on the full graph. Therefore, most GNN architectures designed for the full graph can be seamlessly trained by GraphSAINT. On the other hand, some layer sampling algorithms only support a limited number of GNN architectures. Take JK-net as an example: the jumping knowledge connection requires the node samples in shallower layers to be a superset of the node samples in the deeper layers; minibatches of FastGCN and AS-GCN do not satisfy this condition.

Scalability: GraphSAINT achieves scalability w.r.t. 1). graph size: our subgraph size does not need to grow proportionally with the training graph's size. So even if we are dealing with a million-node graph, the subgraphs can still easily fit in the GPU memory; 2). model size: by resolving "neighbor explosion", training cost scales linearly with GNN width and depth; and 3). amount of parallel resources: graph sampling is highly scalable by trivial task parallelism. In addition, resolving "neighbor explosion" also implies a dramatic reduction in communication overhead, which is critical in a distributed setting (see our IEEE/IPDPS '19 or hardware accelerator development).

[News]: Check out our new work that generalizes subgraph sampling to both the training and inference: shaDow-GNN (NeurIPS'21)!

About This Repo

This repo contains source code of our two papers (ICLR '20 and IEEE/IPDPS '19, see the Citation Section).

The ./graphsaint directory contains the Python implementation of the minibatch training algorithm in ICLR '20. We provide two implementations, one in Tensorflow and the other in PyTorch. The two versions follow the same algorithm. Note that all experiments in our paper are based on the Tensorflow implementation. New experiments on open graph benchmark are based on the PyTorch version.

The ./ipdps19_cpp directory contains the C++ implementation of the parallel training techniques described in IEEE/IPDPS '19 (see ./ipdps19_cpp/README.md). The rest of this repository is for GraphSAINT in ICLR '20.

The GNN architectures supported by this repo:

GNN arch Tensorflow PyTorch C++
GraphSAGE ✔️ ✔️ ✔️
GAT ✔️ ✔️
JK-Net ✔️
GaAN ✔️
MixHop ✔️ ✔️

The graph samplers supported by this repo:

Sampler Tensorflow PyTorch C++
Node ✔️ ✔️
Edge ✔️ ✔️
RW ✔️ ✔️
MRW ✔️ ✔️ ✔️
Full graph ✔️ ✔️

where

  • RW: Random walk sampler
  • MRW: Multi-dimensional random walk sampler
  • Full graph: always returns the full training graph. Meant to be a baseline. No real "sampling" is going on.

You can add your own samplers and GNN layers easily. See the Customization section.

Results

New: We are testing GraphSAINT on Open Graph Benchmark. Currently, we have results for the ogbn-products graph. Note that the ogbn-products accuracies on the leaderboard for other methods are mostly obtained under the transductive setting. Our results are under inductive learning (which is harder).

All results in ICLR '20 can be reproduced by running the configs in ./train_config/. For example, ./train_config/table2/*.yml stores all the configs for Table 2 of our paper. ./train_config/explore/*.yml stores all the configs for deeper GNNs and various GNN architectures (GAT, JK, etc.). In addition, results related to OGB are trained with the configs in ./train_config/open_graph_benchmark/*.yml.

Test set F1-micro scores are summarized below.

Sampler Depth GNN PPI PPI (large) Flickr Reddit Yelp Amazon ogbn-products
Node 2 SAGE 0.960 0.507 0.962 0.641 0.782
Edge 2 SAGE 0.981 0.510 0.966 0.653 0.807
RW 2 SAGE 0.981 0.941 0.511 0.966 0.653 0.815
MRW 2 SAGE 0.980 0.510 0.964 0.652 0.809
RW 5 SAGE 0.995
Edge 4 JK 0.970
RW 2 GAT 0.510 0.967 0.652 0.815
RW 2 GaAN 0.508 0.968 0.651
RW 2 MixHop 0.967
Edge 3 GAT 0.8027

Dependencies

  • python >= 3.6.8
  • tensorflow >=1.12.0 / pytorch >= 1.1.0
  • cython >=0.29.2
  • numpy >= 1.14.3
  • scipy >= 1.1.0
  • scikit-learn >= 0.19.1
  • pyyaml >= 3.12
  • g++ >= 5.4.0
  • openmp >= 4.0

Datasets

All datasets used in our papers are available for download:

  • PPI
  • PPI-large (a larger version of PPI)
  • Reddit
  • Flickr
  • Yelp
  • Amazon
  • ogbn-products
  • ... (more to be added)

They are available on Google Drive link (alternatively, BaiduYun link (code: f1ao)). Rename the folder to data at the root directory. The directory structure should be as below:

GraphSAINT/
│   README.md
│   run_graphsaint.sh
│   ...
│
└───graphsaint/
│   │   globals.py
│   │   cython_sampler.pyx
│   │   ...
│   │
│   └───tensorflow_version/
│   │   │    train.py
│   │   │    model.py
│   │   │    ...
│   │
│   └───pytorch_version/
│       │    train.py
│       │    model.py
│       │    ...
│
└───data/
│   └───ppi/
│   │   │    adj_train.npz
│   │   │    adj_full.npz
│   │   │    ...
│   │
│   └───reddit/
│   │   │    ...
│   │
│   └───...
│

We also have a script that converts datasets from our format to GraphSAGE format. To run the script,

python convert.py <dataset name>

For example, python convert.py ppi will convert the PPI dataset and save the new data in GraphSAGE format to ./data.ignore/ppi/.

New: For data conversion from the OGB format to the GraphSAINT format, please use the script ./data/open_graph_benchmark/ogbn_converter.py. Currently, this script can handle ogbn-products and ogbn-arxiv.

Cython Implemented Parallel Graph Sampler

We have a cython module which needs to be compiled before training can start. Compile the module by running the following from the root directory:

python graphsaint/setup.py build_ext --inplace

Training Configuration

The hyperparameters needed in training can be set via the configuration file: ./train_config/<name>.yml.

The configuration files to reproduce the Table 2 results are packed in ./train_config/table2/.

For detailed description of the configuration file format, please see ./train_config/README.md

Run Training

First of all, please compile cython samplers (see above).

We suggest looking through the available command line arguments defined in ./graphsaint/globals.py (shared by both the Tensorflow and PyTorch versions). By properly setting the flags, you can maximize CPU utilization in the sampling step (by telling the number of available cores), select the directory to place log files, and turn on / off loggers (Tensorboard, Timeline, ...), etc.

NOTE: For all methods compared in the paper (GraphSAINT, GCN, GraphSAGE, FastGCN, S-GCN, AS-GCN, ClusterGCN), sampling or clustering is only performed during training. To obtain the validation / test set accuracy, we run the full batch GNN on the full graph (training + validation + test nodes), and calculate F1 score only for the validation / test nodes. See also issue #11.
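The evaluation protocol in the note above (predict on the full graph, score only the validation / test nodes) can be sketched as follows. The predictions here are hard-coded stand-ins for the output of a full-batch forward pass, and `f1_micro` is a small NumPy helper written for this example rather than the function used in the repository.

```python
import numpy as np


def f1_micro(y_true, y_pred):
    """Micro-averaged F1 for binary indicator matrices of shape (num_nodes, num_classes)."""
    tp = np.logical_and(y_true == 1, y_pred == 1).sum()
    fp = np.logical_and(y_true == 0, y_pred == 1).sum()
    fn = np.logical_and(y_true == 1, y_pred == 0).sum()
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


# Stand-in for full-graph predictions: 6 nodes, 2 labels (training + validation + test).
labels = np.array([[1, 0], [0, 1], [1, 1], [0, 1], [1, 0], [0, 0]])
preds = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [0, 1], [0, 0]])
role = {"tr": [0, 1], "va": [2, 3], "te": [4, 5]}

# Scores are computed only on the rows belonging to validation / test nodes.
f1_val = f1_micro(labels[role["va"]], preds[role["va"]])
f1_test = f1_micro(labels[role["te"]], preds[role["te"]])
print(f1_val, f1_test)
```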

For simplicity of implementation, during validation / test set evaluation, we perform layer propagation using the full graph adjacency matrix. For Amazon or Yelp, this may cause memory issues on some GPUs. If an out-of-memory error occurs, please use the --cpu_eval flag to force the val / test set evaluation to take place on CPU (the minibatch training will still be performed on GPU). See below for other flags.

To run the code on CPU

python -m graphsaint.<tensorflow/pytorch>_version.train --data_prefix ./data/<dataset_name> --train_config <path to train_config yml> --gpu -1

To run the code on GPU

python -m graphsaint.<tensorflow/pytorch>_version.train --data_prefix ./data/<dataset_name> --train_config <path to train_config yml> --gpu <GPU number>

For example --gpu 0 will run on the first GPU. Also, use --gpu <GPU number> --cpu_eval to make GPU perform the minibatch training and CPU to perform the validation / test evaluation.
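If you launch many runs, it may help to assemble the command line programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository; it only uses the module path and the flags listed above, and the config path is left as the placeholder from the Training Configuration section. `shlex.join` requires Python 3.8 or later.

```python
import shlex


def build_train_command(framework, dataset, train_config, gpu=-1, cpu_eval=False):
    """Return the training command as an argument list (hypothetical helper)."""
    if framework not in ("tensorflow", "pytorch"):
        raise ValueError("framework must be 'tensorflow' or 'pytorch'")
    cmd = [
        "python", "-m", f"graphsaint.{framework}_version.train",
        "--data_prefix", f"./data/{dataset}",
        "--train_config", train_config,
        "--gpu", str(gpu),  # -1 selects CPU, 0 is the first GPU
    ]
    if cpu_eval:
        # Train on GPU but run validation / test evaluation on CPU.
        cmd.append("--cpu_eval")
    return cmd


cmd = build_train_command("pytorch", "ppi", "./train_config/<name>.yml",
                          gpu=0, cpu_eval=True)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run` from the repository root.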

We have also implemented dual-GPU training to further speed up training. Simply add the flag --dualGPU and assign two GPUs using the --gpu flag. Currently this only works for GPUs supporting memory pooling and connected by NvLink.

New: we have prepared specific scripts to train OGB graphs. See ./graphsaint/open_graph_benchmark/ for the scripts and instructions.

Customization

Below we describe how to customize this code base for your own research / product.

How to Prepare Your Own Dataset?

Suppose your full graph contains N nodes. Node labels are drawn from C classes, and each node has a length-F initial attribute vector. If your train/val/test split is a/b/c (i.e., a+b+c=1), then:

adj_full.npz: a sparse matrix in CSR format, stored as a scipy.sparse.csr_matrix. The shape is N by N. Non-zeros in the matrix correspond to all the edges in the full graph. It doesn't matter if the two nodes connected by an edge are training, validation or test nodes. For unweighted graph, the non-zeros are all 1.

adj_train.npz: a sparse matrix in CSR format, stored as a scipy.sparse.csr_matrix. The shape is also N by N. However, non-zeros in the matrix only correspond to edges connecting two training nodes. The graph sampler only picks nodes/edges from this adj_train, not adj_full. Therefore, neither the attribute information nor the structural information of the validation / test nodes is revealed during training. Also, note that only aN rows and cols of adj_train contain non-zeros. See also issue #11. For unweighted graph, the non-zeros are all 1.

role.json: a dictionary of three keys. Key 'tr' corresponds to the list of all training node indices. Key 'va' corresponds to the list of all validation node indices. Key 'te' corresponds to the list of all test node indices. Note that in the raw data, nodes may have string-type IDs. You would need to re-assign numerical IDs (0 to N-1) to the nodes, so that you can index into the matrices of adj, features and class labels.

class_map.json: a dictionary of length N. Each key is a node index, and each value is either a length C binary list (for multi-class classification) or an integer scalar (0 to C-1, for single-class classification).

feats.npy: a numpy array of shape N by F. Row i corresponds to the attribute vector of node i.
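The five files described above can be produced for a toy graph as sketched below. This is an illustration under stated assumptions: it uses `scipy.sparse.save_npz` to write the .npz files (assumed to match what the loader expects), single-class integer labels, an undirected unweighted graph with no duplicate edges, and feature rows ordered by the sorted raw IDs. The helper names `build_adj` and `write_dataset` are hypothetical.

```python
import json
import os
import tempfile

import numpy as np
import scipy.sparse as sp


def build_adj(edges, n):
    """Symmetric, unweighted CSR adjacency from a list of (i, j) index pairs."""
    rows = [i for i, j in edges] + [j for i, j in edges]
    cols = [j for i, j in edges] + [i for i, j in edges]
    return sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))


def write_dataset(out_dir, raw_edges, raw_labels, roles, feats):
    # Re-assign numerical IDs 0..N-1 to the raw string IDs.
    ids = sorted(roles)
    idx = {s: i for i, s in enumerate(ids)}
    n = len(ids)
    edges = [(idx[u], idx[v]) for u, v in raw_edges]
    train = {idx[s] for s in ids if roles[s] == "tr"}
    adj_full = build_adj(edges, n)
    # adj_train keeps only edges whose two endpoints are both training nodes.
    adj_train = build_adj([(i, j) for i, j in edges if i in train and j in train], n)
    role = {k: [idx[s] for s in ids if roles[s] == k] for k in ("tr", "va", "te")}
    class_map = {str(idx[s]): int(raw_labels[s]) for s in ids}  # JSON keys are strings
    os.makedirs(out_dir, exist_ok=True)
    sp.save_npz(os.path.join(out_dir, "adj_full.npz"), adj_full)
    sp.save_npz(os.path.join(out_dir, "adj_train.npz"), adj_train)
    np.save(os.path.join(out_dir, "feats.npy"), feats)
    with open(os.path.join(out_dir, "role.json"), "w") as f:
        json.dump(role, f)
    with open(os.path.join(out_dir, "class_map.json"), "w") as f:
        json.dump(class_map, f)


raw_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
raw_labels = {"a": 0, "b": 1, "c": 0, "d": 1, "e": 0}
roles = {"a": "tr", "b": "tr", "c": "tr", "d": "va", "e": "te"}
feats = np.arange(15, dtype=np.float64).reshape(5, 3)  # N=5 nodes, F=3 features

with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "toy")
    write_dataset(out_dir, raw_edges, raw_labels, roles, feats)
    adj_full = sp.load_npz(os.path.join(out_dir, "adj_full.npz"))
    adj_train = sp.load_npz(os.path.join(out_dir, "adj_train.npz"))
    with open(os.path.join(out_dir, "role.json")) as f:
        role = json.load(f)

print(adj_full.nnz, adj_train.nnz, role)
```

In this toy graph the edges touching the validation node d and the test node e appear in adj_full but not in adj_train.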

How to Add Your Own Sampler?

All samplers are implemented as subclasses of GraphSampler in ./graphsaint/graph_samplers.py. There are two ways to implement your sampler subclass:

  1. Implement in pure python. Override the par_sample function of the super-class. We provide a basic example in the NodeSamplingVanillaPython class of ./graphsaint/graph_samplers.py.
    • Pros: Easy to implement.
    • Cons: May be slow to execute. It is non-trivial to parallelize a pure python function.
  2. Implement in cython. You need to add a subclass of the Sampler in ./graphsaint/cython_sampler.pyx. In the subclass, you only need to override the __cinit__ and sample functions. The sample function defines the sequential behavior of the sampler. We automatically perform task-level parallelism by launching multiple samplers at the same time.
    • Pros: Fits in the parallel-execution framework. C++ level execution speed.
    • Cons: Hard to code.
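To give a feel for what the sequential behavior of a sampler looks like, here is a standalone random-walk sampler in pure Python. It deliberately does not subclass GraphSampler: the class name, constructor arguments and return value are assumptions for illustration, and the real par_sample interface in ./graphsaint/graph_samplers.py may differ.

```python
import numpy as np


class RandomWalkSamplerSketch:
    """Standalone illustration only; NOT the repository's GraphSampler interface."""

    def __init__(self, indptr, indices, num_roots, walk_length, seed=0):
        self.indptr = indptr          # CSR row pointers of the training adjacency
        self.indices = indices        # CSR column indices (neighbor lists)
        self.num_roots = num_roots
        self.walk_length = walk_length
        self.rng = np.random.default_rng(seed)

    def sample(self):
        """Return the sorted node indices visited by random walks from random roots."""
        num_nodes = len(self.indptr) - 1
        roots = self.rng.choice(num_nodes, size=self.num_roots, replace=False)
        visited = set(roots.tolist())
        for root in roots:
            cur = int(root)
            for _ in range(self.walk_length):
                start, end = self.indptr[cur], self.indptr[cur + 1]
                if start == end:      # dead end: node has no neighbors
                    break
                cur = int(self.indices[self.rng.integers(start, end)])
                visited.add(cur)
        return np.array(sorted(visited))


# Toy ring graph with 6 nodes: node i is connected to (i-1) % 6 and (i+1) % 6.
n = 6
indices = np.array([[(i - 1) % n, (i + 1) % n] for i in range(n)]).ravel()
indptr = np.arange(0, 2 * n + 1, 2)

sampler = RandomWalkSamplerSketch(indptr, indices, num_roots=2, walk_length=2)
nodes = sampler.sample()
print(nodes)
```

The induced subgraph on the returned nodes would then be extracted from adj_train before training on it.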

How to Support Your Own GNN Layer?

Add a layer in ./graphsaint/<tensorflow or pytorch>_version/layers.py. You would also need to make a minor update to the __init__ function of the GraphSAINT class in ./graphsaint/<tensorflow or pytorch>_version/models.py, so that the model knows how to look up the correct class based on the keyword in the yml config.

Citation & Acknowledgement

Supported by DARPA under FA8750-17-C-0086, NSF under CCF-1919289 and OAC-1911229.

We thank Matthias Fey for providing a reference implementation in the PyTorch Geometric library.

We thank the OGB team for using GraphSAINT on large scale experiments.

  • ICLR 2020:
@inproceedings{graphsaint-iclr20,
title={{GraphSAINT}: Graph Sampling Based Inductive Learning Method},
author={Hanqing Zeng and Hongkuan Zhou and Ajitesh Srivastava and Rajgopal Kannan and Viktor Prasanna},
booktitle={International Conference on Learning Representations},
year={2020},
url={https://openreview.net/forum?id=BJe8pkHFwS}
}
  • IEEE/IPDPS 2019:
@INPROCEEDINGS{graphsaint-ipdps19,
author={Hanqing Zeng and Hongkuan Zhou and Ajitesh Srivastava and Rajgopal Kannan and Viktor Prasanna},
booktitle={2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)},
title={Accurate, Efficient and Scalable Graph Embedding},
year={2019},
month={May},
}

graphsaint's People

Contributors

fandreuz, graphsaint, tedzhouhk, zimplex


graphsaint's Issues

confused about some details in the paper

Dear professor,
Recently I have been studying the GraphSAINT paper, but I am confused about the minibatch. Looking at your code, I found that your minibatch is divided by nodes in order!? I would also like to know the difference between a minibatch and a subgraph. Do they represent the same thing? Looking forward to your reply. Thanks so much!!
`class Minibatch:
    """
    This minibatch iterator iterates over nodes for supervised learning.
    """

    def __init__(self, adj_full, adj_full_norm, adj_train, role, class_arr, placeholders, train_params, **kwargs):
        """
        role:       array of string (length |V|)
                    storing role of the node ('tr'/'va'/'te')
        class_arr:  array of float (shape |V|xf)
                    storing initial feature vectors
        """
        self.num_proc = 1
        self.node_train = np.array(role['tr'])
        self.node_val = np.array(role['va'])
        self.node_test = np.array(role['te'])

        self.class_arr = class_arr
        self.adj_full = adj_full
        self.adj_full_norm = adj_full_norm
        s1 = int(adj_full_norm.shape[0] / 8 * 1)
        s2 = int(adj_full_norm.shape[0] / 8 * 2)
        s3 = int(adj_full_norm.shape[0] / 8 * 3)
        s4 = int(adj_full_norm.shape[0] / 8 * 4)
        s5 = int(adj_full_norm.shape[0] / 8 * 5)
        s6 = int(adj_full_norm.shape[0] / 8 * 6)
        s7 = int(adj_full_norm.shape[0] / 8 * 7)
        self.dim0_adj_sub = adj_full_norm.shape[0] / 8
        self.adj_full_norm_0 = adj_full_norm[:s1, :]
        self.adj_full_norm_1 = adj_full_norm[s1:s2, :]
        self.adj_full_norm_2 = adj_full_norm[s2:s3, :]
        self.adj_full_norm_3 = adj_full_norm[s3:s4, :]
        self.adj_full_norm_4 = adj_full_norm[s4:s5, :]
        self.adj_full_norm_5 = adj_full_norm[s5:s6, :]
        self.adj_full_norm_6 = adj_full_norm[s6:s7, :]
        self.adj_full_norm_7 = adj_full_norm[s7:, :]
        self.adj_train = adj_train`

No Cuda GPUs availables

Hi,
I am trying to implement subgraph sampling for recommendation with a knowledge graph using GraphSAINT.
I added a print(torch.cuda.is_available()) call in the __init__() function of the Minibatch class.
It printed True when I ran train.py but it printed False when I ran from Minibatch.
I guess the problem here comes from the line from graphsaint.graph_samplers import *.
Can you clarify it for me?

some question about it

Hello, thank you for your work.
The idea in the article is to sample a big graph into a sub-graph. I segmented the brain. Each subject has a graph, so there are multiple graphs, and each brain area corresponds to a different label and color. How should I deal with this? Which code should I mainly look at?
If I have 100 pictures, do I downsample them to get 100 sub-graphs?

Questions about alpha and lambda

Hello,
I'm confused about the approximate calculation of alpha and lambda in paper. Why can we set $\alpha_{u,v} = C_{u,v} / C_v$ and $\lambda_v = C_v / N$?
Thanks.

Some question about Reddit dataset

Hi, congratulations, it seems you got a good score in the ICLR open review.
I have a question about the Reddit dataset. When I print the "role.json" of reddit,
I find train/val/test: 151701/23699/55334; however, the Cluster-GCN paper
reports train/val/test: 153932/23699/55334.
Why do the split sizes on Reddit differ between the two papers?

Question about adj_full_norm

Dear authors, thanks for publishing this great repo!

I have a small question:

In the PyTorch version of the code, the full adj is normalized and used to generate the minibatch. However, in an inductive setting, one may only see the training data. Is it more appropriate to normalize the train/val/test nodes separately, or do I misunderstand the code here? Looking forward to your reply! Thank you very much!

Code to make .bin files

Hi,
Do you have code to convert data to the *.bin format used in the ipdps19_cpp folder? The README.md in that folder says to use the convert.py script, but that script does not generate those files. Thanks.
jgw

About Flickr Dataset

Might be a stupid question, but it seems to me that the Flickr dataset contains a single graph where you have train/val/test masks covering different nodes. How do you ensure this is the inductive learning setting, where during training the sampled neighbours do not contain nodes from the val/test splits?

purpose of minibatch.shuffle() in training loop

Hi,
In the training loop, I notice there is a minibatch.shuffle() at the beginning of each epoch. Upon inspecting minibatch.py, I observe that this function performs np.random.permutation(self.node_train). Actually, it is not clear to me what this shuffling is doing and what is its benefit. Could you please clarify?

ogbn-products submission

Hi, thanks for your leaderboard submission to OGB.
I could not find something like from ogb.nodeproppred import PygNodePropPredDataset. How did you obtain the dataset? Also, please add a README.md at https://github.com/GraphSAINT/GraphSAINT/tree/master/graphsaint/pytorch_version to give details on how to reproduce your results.

Edit: I found the dataset importing at ogb_converter.py. In README.md, could you describe the step-by-step process of how you can obtain the results?

Thanks!!

Error in tensorflow version when running ogbn-products

Hi, I have encountered an error when I try to run the tensorflow implementation on ogbn-products. Please see below.

Environment:

conda create -n graphsaint_1.15_env
conda activate graphsaint_1.15_env
conda install \
      cython==0.29.21 \
      pyyaml==5.3.1 \
      scikit-learn==0.23.2 \
      tensorflow==1.15.0
python graphsaint/setup.py build_ext --inplace

Run command:

python -m graphsaint.tensorflow_version.train --data_prefix /srv/scratch/ogb/datasets/cb/nodeproppred/ogbn_products/GraphSAINT --train_config ./train_config/open_graph_benchmark/ogbn-products_3_e_gat.yml --gpu -1

Stack trace:

Traceback (most recent call last):
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] ndims must be >= 2: 1
	 [[{{node MatMul}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/srv/scratch/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/train.py", line 243, in <module>
    tf.app.run(main=train_main)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/srv/scratch/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/train.py", line 239, in train_main
    ret = train(train_phases,model,minibatch,sess,train_stat,ph_misc_stat,summary_writer)
  File "/srv/scratch/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/train.py", line 167, in train
    options=tf.RunOptions(report_tensor_allocations_upon_oom=True))
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/srv/scratch/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] ndims must be >= 2: 1
	 [[node MatMul (defined at /packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py:1748) ]]

Original stack trace for 'MatMul':
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/train.py", line 243, in <module>
    tf.app.run(main=train_main)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/train.py", line 238, in train_main
    model,minibatch,sess,train_stat,ph_misc_stat,summary_writer = prepare(train_data,train_params,arch_gcn)
  File "/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/train.py", line 98, in prepare
    feats, arch_gcn, train_params, adj_full_norm, logging=True)
  File "/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/model.py", line 66, in __init__
    self.build()
  File "/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/model.py", line 104, in build
    self._loss()
  File "/jgwohlbier/GraphSAINT/graphsaint/tensorflow_version/model.py", line 133, in _loss
    tf.reshape(self._weight_loss_batch,(-1,1)))
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/ops/math_ops.py", line 2716, in matmul
    return batch_mat_mul_fn(a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 1712, in batch_mat_mul_v2
    "BatchMatMulV2", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "/packages/spack/opt/spack/linux-rhel8-skylake_avx512/gcc-8.3.1/anaconda3-2020.07-weugqkfkxd6zmn2irm7lpmujzczwebiw/envs/graphsaint_1.15_env/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

Why inductive?

Hello, thanks for sharing your code! In readme.md, you said "For simplicity of implementation, during validation / test set evaluation, we perform layer propagation using the full graph adjacency matrix." So when a new node comes, all the node embeddings need to be calculated. But in GraphSAGE, for example, when a new node comes, only some nodes are involved rather than all embeddings, which I think is what "inductive" truly means. In other words, I suppose an "inductive" graph learning method should: 1. be trainable without val and test nodes and links; 2. be able to generate the node embedding for a newly arriving node very fast. It seems to me that GraphSAINT can satisfy 1 but cannot satisfy 2. Is that correct? Thanks!

Performance on citation network

Hi authors, thanks for making the code public. I noticed most GNNs have been evaluated on citation networks, such as Cora, Citeseer and Pubmed. I am also very interested in the results on Cora, Citeseer and Pubmed. Have you tested GraphSAINT on these datasets? Could you report the corresponding results if convenient? Thanks very much!

two issues

Hello, I ran your model on PPI and Flickr. There seem to be some problems. I hope you can revise the code in a more elegant style.
My OS is Win10.
Saving model:
Using a timestamp as the file name throws an OSError; when I use a normal string, training succeeds.
#---------------------
Saving model ...
Traceback (most recent call last):
File "D:\my soft\python374\lib\runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "D:\my soft\python374\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "G:\GraphSAINT-master\graphsaint\pytorch_version\train.py", line 101, in
train(train_phases, model, minibatch, minibatch_eval, model_eval)
File "G:\GraphSAINT-master\graphsaint\pytorch_version\train.py", line 80, in train
torch.save(model.state_dict(), path_saver)
File "D:\my soft\python374\lib\site-packages\torch\serialization.py", line 327, in save
with _open_file_like(f, 'wb') as opened_file:
File "D:\my soft\python374\lib\site-packages\torch\serialization.py", line 212, in _open_file_like
return _open_file(name_or_buffer, mode)
File "D:\my soft\python374\lib\site-packages\torch\serialization.py", line 193, in init
super(_open_file, self).init(open(name, mode))
OSError: [Errno 22] Invalid argument: './pytorch_models/saved_model_2020-04-12 14:35:13.pkl'
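
The path in the error above contains `:` characters, which Windows does not allow in file names. A minimal sketch of one possible workaround is shown below. The helper name `model_save_path` and its parameters are hypothetical and not part of the repository; only the directory and file-name prefix are taken from the error message.

```python
from datetime import datetime
from pathlib import Path


def model_save_path(directory="./pytorch_models", now=None):
    """Build a checkpoint path whose file name is also valid on Windows.

    Hypothetical helper for illustration; not part of GraphSAINT.
    """
    now = now or datetime.now()
    # Use '-' and '_' as separators: ':' is not allowed in Windows file names.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return Path(directory) / f"saved_model_{stamp}.pkl"


# Same timestamp as in the failing path above.
path = model_save_path(now=datetime(2020, 4, 12, 14, 35, 13))
print(path.name)
```

The resulting path could then be passed to `torch.save` in place of the original one.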

Target type:
I added "labels = labels.long()" before models.py, line 82, but I am not sure whether this will cause problems when running on other datasets.
#---------------------
Traceback (most recent call last):
File "D:\my soft\python374\lib\runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "D:\my soft\python374\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "G:\GraphSAINT-master\graphsaint\pytorch_version\train.py", line 101, in
train(train_phases, model, minibatch, minibatch_eval, model_eval)
File "G:\GraphSAINT-master\graphsaint\pytorch_version\train.py", line 72, in train
loss_val, f1mic_val, f1mac_val = evaluate_full_batch(model_eval, minibatch_eval, mode='val')
File "G:\GraphSAINT-master\graphsaint\pytorch_version\train.py", line 18, in evaluate_full_batch
loss,preds,labels = model.eval_step(*minibatch.one_batch(mode=mode))
File "G:\GraphSAINT-master\graphsaint\pytorch_version\models.py", line 123, in eval_step
loss = self._loss(preds,labels_converted,norm_loss_subgraph)
File "G:\GraphSAINT-master\graphsaint\pytorch_version\models.py", line 82, in _loss
_ls = torch.nn.CrossEntropyLoss(reduction='none')(preds, labels)
File "D:\my soft\python374\lib\site-packages\torch\nn\modules\module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "D:\my soft\python374\lib\site-packages\torch\nn\modules\loss.py", line 916, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "D:\my soft\python374\lib\site-packages\torch\nn\functional.py", line 2021, in cross_entropy
return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
File "D:\my soft\python374\lib\site-packages\torch\nn\functional.py", line 1838, in nll_loss
ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: Expected object of scalar type Long but got scalar type Int for argument #2 'target' in call to _thnn_nll_loss_forward

thank you

How to do inference for unseen nodes (Inductive inference)?

Hi, I am looking for sample code for a node classification task that does inference for an unseen node X, where X does not exist at training time.
At inference time, the unseen node X will be connected to a subset of seen nodes and also to another set of unseen nodes.
I cannot find sample code for this in the code you provided.

Dataset Issue

Hello, I tried to download these datasets from Google Drive, but it is too slow. Is there another way to access the datasets?

Some problem about Amazon dataset

Sorry to disturb you. When I downloaded the Amazon dataset from Baidu Yun, I found that some nodes have no label: in class_map.json, the labels of some nodes are all 0 across the 107 dimensions. Also, the number of nodes is not the same as the number described in the paper. Can you check it?

how to generate my own dataset

The idea is good, and I would like to test the algorithm on a transductive dataset such as Cora. Can you provide a generator script?

GraphSaint arch: '0-0-0-0' vs MLP sanity check

Hi there,

If my understanding is correct, setting arch: '0-0-0-0' in the train_config would be equivalent to an MLP.

I trained a GraphSAINT model with this architecture and compared it to an MLP as a sanity check and got much worse performance for the GraphSAINT model:
image

The architectures were the same for MLP and GraphSAINT model (same layer dims, activation functions, dropout, learning rate, etc.).

I expect to see the same performance here. Any idea what the issue might be?

As a side note: I also noticed that when I do an architecture search (with neighbor aggregations), I can get comparable but slightly worse results than an MLP. Taken together with the above, this means that the neighbor information helps the model, but the GraphSAINT "MLP" baseline is still very low for some reason.

The Selection of Hyperparameters

Hello, I would like to ask how to select hyperparameters such as the number of neighbors and the batch size for GraphSAGE when comparing it with GraphSAINT, so that both have basically the same training conditions.

some question about sampling

Hey bro, it's me again! You good?
Recently I have been reading some papers about graph sampling, and I remember that your team used excellent graph sampling methods in this paper, so I would like to ask for your help again.
1. The most important question is how to use random walk based samplers. The goal of a random walk is to obtain several sequences of vertices that preserve the properties of the original graph. But how do you turn those sequences into a subgraph? Do you just pick some specific vertices and run random walks from them repeatedly to construct a subgraph?
2. What do you think of the effect of the regular random walk sampler versus the multi-dimensional random walk sampler defined in Ribeiro & Towsley (2010)? In terms of experimental results, the performance of the two is similar.
3. If I want to use your sampling result but would prefer to obtain several random walk sequences rather than the complete subgraph, could you give me some advice?
Thank you so much! And sorry to bother you again!
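
As an illustration of questions 1 and 3, here is a minimal sketch of how walk sequences can be turned into a subgraph: start walks from a few root vertices, take the union of all visited vertices, and keep every original edge between them (the induced subgraph). This is not the repository's sampler; the function `random_walk_subgraph`, its parameters, and the adjacency-dict input are assumptions made for the example. The walks themselves are returned as well, in case the sequences are what is needed.

```python
import random


def random_walk_subgraph(adj, num_roots, walk_length, seed=0):
    """Return (walks, nodes, edges) for a random-walk-based subgraph sample.

    adj: dict mapping each node to a list of its neighbors (undirected graph).
    Hypothetical helper for illustration; not the GraphSAINT implementation.
    """
    rng = random.Random(seed)
    roots = rng.sample(sorted(adj), num_roots)
    walks = []
    for root in roots:
        walk = [root]
        for _ in range(walk_length):
            neighbors = adj[walk[-1]]
            if not neighbors:  # dead end: stop this walk early
                break
            walk.append(rng.choice(neighbors))
        walks.append(walk)
    # Node set of the subgraph = union of all visited vertices.
    nodes = set().union(*walks)
    # Induced subgraph: keep each original edge whose endpoints were both visited.
    edges = {(u, v) for u in nodes for v in adj[u] if v in nodes and u < v}
    return walks, nodes, edges


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
walks, nodes, edges = random_walk_subgraph(adj, num_roots=2, walk_length=2)
print(walks, nodes, edges)
```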

question about the computation of loss_norm

According to the paper, loss_norm(v) = C(v) / N, where N is the number of subgraphs and C(v) is the number of times v appears across all subgraphs. However, in the PyTorch implementation I found that the actual computation is different:

self.norm_loss_train[self.node_train] = num_subg / self.norm_loss_train[self.node_train] / self.node_train.size

I know that for convenience the reciprocal is calculated instead, but the two still conflict (specifically, the code additionally divides by self.node_train.size). Is there something I am misunderstanding? Thank you!
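
To make the difference concrete, the sketch below evaluates both expressions on made-up counts. It assumes that `self.norm_loss_train` holds the appearance counts C(v) before the quoted line runs; the function name `loss_weights` and the numbers are hypothetical. Under that assumption the code computes N / C(v) / |V_train|, that is, the reciprocal of the paper's C(v) / N scaled by a constant 1 / |V_train| that is the same for every node.

```python
import numpy as np


def loss_weights(counts, num_subgraphs):
    """Compare the paper's normalization with the quoted code expression.

    counts: C(v) for each training node (assumed nonzero).
    Hypothetical helper for illustration only.
    """
    counts = np.asarray(counts, dtype=np.float64)
    lam = counts / num_subgraphs                      # paper: C(v) / N
    weights = num_subgraphs / counts / counts.size    # quoted code expression
    return lam, weights


# Made-up counts for 4 training nodes over 200 sampled subgraphs.
lam, weights = loss_weights([50, 100, 200, 25], num_subgraphs=200)
# The code's weight equals 1 / (lam * number_of_training_nodes).
print(np.allclose(weights, 1.0 / (lam * 4)))
```

Because the extra factor is identical for all nodes, it rescales the total loss but does not change the relative weighting between nodes. Whether it is intended as an average over training nodes is a question for the authors.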

hyper parameter for baselines

Hi, is it possible for you to share the final hyper parameter (learning rate, dropout) of the baselines? Thank you!

question on inter-subgraph edges in node and RW sampling

Hi,
Can I please ask a question about the graphSAINT mechanism to clarify my understanding?
Node and RW sampling are techniques for constructing a mini-batch consisting of a specific number of nodes. These sampled nodes in the current subgraph will have edges between them. But does GraphSAINT take into account the edges between a sampled node in the current subgraph and some other node that has not been sampled in the current subgraph? For example, A-B may be an edge in the actual full graph. Let's say A is sampled in the current subgraph but B is not. Does GraphSAINT do anything about this edge during the subgraph training where A is present? If not, could you please explain why it is OK to ignore these inter-subgraph edges? Thanks!
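
The situation described in the question can be sketched as follows, assuming the subgraph is the one induced by the sampled nodes, so an edge such as A-B with only one sampled endpoint is absent from that minibatch. The function `split_edges` and the toy edge list are hypothetical. The project overview states that normalization is applied to compensate for the bias introduced by graph sampling; the sketch only shows which edges are affected.

```python
def split_edges(edges, sampled_nodes):
    """Split edges into those kept in the induced subgraph and those cut.

    Hypothetical helper for illustration; not part of GraphSAINT.
    """
    sampled = set(sampled_nodes)
    # Kept: both endpoints were sampled.
    kept = [(u, v) for u, v in edges if u in sampled and v in sampled]
    # Cut: exactly one endpoint was sampled (the "inter-subgraph" edges).
    cut = [(u, v) for u, v in edges if (u in sampled) != (v in sampled)]
    return kept, cut


edges = [("A", "B"), ("A", "C"), ("C", "D"), ("B", "D")]
kept, cut = split_edges(edges, sampled_nodes=["A", "C"])
print(kept)  # [('A', 'C')]
print(cut)   # [('A', 'B'), ('C', 'D')]
```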

About graph construction

I wonder if the code for the construction of the original graph dataset could be provided for reference? I would like to learn how to use textual information for constructing such text attribute graphs, thanks.

What is adj_train and how to create it?

I wanted to try GraphSAINT on my own data and was able to create the specified files for my dataset, except for adj_train.
At first I thought it was a graph with only the training nodes, but it contains all the nodes in adj_full. Then I tried to use adj_full as adj_train. This only works for the node sampler; for the other samplers I get this error:
File "/content/GraphSAINT/graphsaint/minibatch.py", line 131, in set_sampler assert self.norm_loss_train[self.node_val].sum() + self.norm_loss_train[self.node_test].sum() == 0 AssertionError
Is it the full graph without edges of test nodes? How can I create it from the full graph?
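
One reading consistent with the assertion above (validation and test nodes must never be sampled) is that adj_train keeps the N-by-N shape of adj_full but contains only edges whose two endpoints are both training nodes. A minimal sketch under that assumption is shown below. The function `build_adj_train` is hypothetical, and the expected file layout should be checked against the repository's data description.

```python
import numpy as np
import scipy.sparse as sp


def build_adj_train(adj_full, train_nodes):
    """Keep only the edges of adj_full that connect two training nodes.

    The result has the same shape as adj_full. Hypothetical helper for
    illustration; assumes adj_train is defined as train-train edges only.
    """
    n = adj_full.shape[0]
    is_train = np.zeros(n, dtype=bool)
    is_train[np.asarray(train_nodes)] = True
    coo = adj_full.tocoo()
    # An edge survives only if both its row and column nodes are training nodes.
    keep = is_train[coo.row] & is_train[coo.col]
    return sp.csr_matrix(
        (coo.data[keep], (coo.row[keep], coo.col[keep])), shape=adj_full.shape
    )


# Toy path graph 0-1-2-3, with node 3 held out of training.
adj_full = sp.csr_matrix(
    np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]])
)
adj_train = build_adj_train(adj_full, train_nodes=[0, 1, 2])
print(adj_train.toarray())
```

The result could then be stored with `scipy.sparse.save_npz`, if that matches the format of the other adjacency file.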

Problems compiling the Cython samplers

When I compile the Cython samplers from the root path of GraphSAINT, the process does not complete on either my MacBook or my Ubuntu server. When I then run the sh file, Python reports that graphsaint.cython_sampler is missing. I would like to know what I should do to run the code.
The detailed errors are listed below:
macOS:
$ python graphsaint/setup.py build_ext --inplace
Compiling graphsaint/cython_sampler.pyx because it changed.
Compiling graphsaint/cython_utils.pyx because it changed.
Compiling graphsaint/norm_aggr.pyx because it changed.
[1/3] Cythonizing graphsaint/cython_sampler.pyx
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
[2/3] Cythonizing graphsaint/cython_utils.pyx
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
[3/3] Cythonizing graphsaint/norm_aggr.pyx
warning: graphsaint/norm_aggr.pyx:24:17: Use boundscheck(False) for faster access
warning: graphsaint/norm_aggr.pyx:24:35: Use boundscheck(False) for faster access
warning: graphsaint/norm_aggr.pyx:24:51: Use boundscheck(False) for faster access
running build_ext
building 'graphsaint.cython_sampler' extension
creating build
creating build/temp.macosx-10.7-x86_64-3.6
creating build/temp.macosx-10.7-x86_64-3.6/graphsaint
g++ -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/anaconda3/include -arch x86_64 -I/anaconda3/include -arch x86_64 -I/anaconda3/lib/python3.6/site-packages/numpy/core/include -I/anaconda3/include/python3.6m -c graphsaint/cython_sampler.cpp -o build/temp.macosx-10.7-x86_64-3.6/graphsaint/cython_sampler.o -fopenmp -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
In file included from /anaconda3/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/syslimits.h:7:0,
from /anaconda3/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/limits.h:34,
from /anaconda3/include/python3.6m/Python.h:11,
from graphsaint/cython_sampler.cpp:25:
/anaconda3/lib/gcc/x86_64-apple-darwin11.4.2/4.8.5/include-fixed/limits.h:168:61: fatal error: limits.h: No such file or directory
#include_next <limits.h> /* recurse down to the real one */
^
compilation terminated.

Ubuntu:
python graphsaint/setup.py build_ext --inplace
Compiling graphsaint/cython_sampler.pyx because it changed.
Compiling graphsaint/cython_utils.pyx because it changed.
Compiling graphsaint/norm_aggr.pyx because it changed.
[1/3] Cythonizing graphsaint/cython_sampler.pyx
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
[2/3] Cythonizing graphsaint/cython_utils.pyx
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:28:29: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
warning: graphsaint/cython_utils.pxd:33:31: Buffer unpacking not optimized away.
[3/3] Cythonizing graphsaint/norm_aggr.pyx
warning: graphsaint/norm_aggr.pyx:24:51: Use boundscheck(False) for faster access
warning: graphsaint/norm_aggr.pyx:24:35: Use boundscheck(False) for faster access
warning: graphsaint/norm_aggr.pyx:24:17: Use boundscheck(False) for faster access
running build_ext
building 'graphsaint.cython_sampler' extension
creating build
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/graphsaint
g++ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/chen/anaconda3/lib/python3.7/site-packages/numpy/core/include -I/home/chen/anaconda3/include/python3.7m -c graphsaint/cython_sampler.cpp -o build/temp.linux-x86_64-3.7/graphsaint/cython_sampler.o -fopenmp -std=c++11
unable to execute 'g++': No such file or directory
error: command 'g++' failed with exit status 1

Error: Cannot use GPU when output.shape[1] * nnz(a) > 2^31

When I run
python -m graphsaint.tensorflow_version.train --data_prefix=./data/ogbn-product/ --train_config=./train_config/open_graph_benchmark/ogbn-products_3_e_gat.yml --gpu=1

I get the following error:
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
/raid/graphs/GraphSAINT/graphsaint/utils.py:112: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
train_config = yaml.load(f_train_config)
Loading training data..
Done loading training data..
/raid/graphs/GraphSAINT/graphsaint/utils.py:190: RuntimeWarning: divide by zero encountered in true_divide
norm_diag = sp.dia_matrix((1/D,0),shape=diag_shape)

layer attentionaggregator_1, dim: [100,256]
layer attentionaggregator_2, dim: [256,256]
layer attentionaggregator_3, dim: [256,256]
layer highorderaggregator_1, dim: [256,47]
WARNING:tensorflow:From /raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py:127: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-10-21 14:37:55.577976: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-10-21 14:37:56.899292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:07:00.0
totalMemory: 31.72GiB freeMemory: 31.41GiB
2020-10-21 14:37:56.899353: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-10-21 14:37:57.503937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-21 14:37:57.503971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2020-10-21 14:37:57.503980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2020-10-21 14:37:57.526776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 30475 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:07:00.0, compute capability: 7.0)
/raid/graphs/GraphSAINT/graphsaint/graph_samplers.py:168: RuntimeWarning: divide by zero encountered in true_divide
self.adj_train_norm = scipy.sparse.dia_matrix((1 / self.deg_train, 0), shape=adj_train.shape).dot(adj_train)
sampling 200 subgraphs: time = 3.898 sec
START PHASE 0
Epoch 0
Traceback (most recent call last):
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[{{node attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul}} = SparseTensorDenseMatMul[T=DT_FLOAT, Tindices=DT_INT64, adjoint_a=false, adjoint_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_adj_subgraph_0/indices_0_3/_355, attentionaggregator_1/mul_12, _arg_adj_subgraph_0/shape_0_4/_413, attentionaggregator_1/MatMul_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 244, in
tf.app.run(main=train_main)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 239, in train_main
ret = train(train_phases,model,minibatch,sess,train_stat,ph_misc_stat,summary_writer)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 192, in train
evaluate_full_batch(sess_eval,model,minibatch,many_runs_timeline,mode='val')
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 53, in evaluate_full_batch
preds,loss = sess.run([model.preds, model.loss], feed_dict=feed_dict)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[node attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul (defined at /raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py:282) = SparseTensorDenseMatMul[T=DT_FLOAT, Tindices=DT_INT64, adjoint_a=false, adjoint_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_adj_subgraph_0/indices_0_3/_355, attentionaggregator_1/mul_12, _arg_adj_subgraph_0/shape_0_4/_413, attentionaggregator_1/MatMul_1)]]

Caused by op 'attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul', defined at:
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 244, in
tf.app.run(main=train_main)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 238, in train_main
model,minibatch,sess,train_stat,ph_misc_stat,summary_writer = prepare(train_data,train_params,arch_gcn)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 98, in prepare
feats, arch_gcn, train_params, adj_full_norm, logging=True)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py", line 66, in init
self.build()
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py", line 89, in build
_outputs_l = self.aggregate_subgraph()
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py", line 165, in aggregate_subgraph
hidden = self.aggregatorslayer
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py", line 61, in call
outputs = self._call(inputs)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py", line 282, in _call
ret_neigh_i = self.act(tf.sparse_tensor_dense_matmul(adj_weighted,vw_neigh[i])
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/ops/sparse_ops.py", line 2004, in sparse_tensor_dense_matmul
adjoint_b=adjoint_b)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/ops/gen_sparse_ops.py", line 2789, in sparse_tensor_dense_mat_mul
name=name)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in init
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[node attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul (defined at /raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py:282) = SparseTensorDenseMatMul[T=DT_FLOAT, Tindices=DT_INT64, adjoint_a=false, adjoint_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_adj_subgraph_0/indices_0_3/_355, attentionaggregator_1/mul_12, _arg_adj_subgraph_0/shape_0_4/_413, attentionaggregator_1/MatMul_1)]]
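
The traceback above shows the error being raised inside `evaluate_full_batch`, and the message itself states the condition `output.shape[1] * nnz(a) > 2^31`. The sketch below only restates that condition as arithmetic; it does not reproduce TensorFlow internals. The function names are hypothetical, and the value 256 is taken from the layer dimensions printed in the log. The number of columns actually fed into the multiplication may be smaller, for example if the features are split across attention heads.

```python
LIMIT = 2 ** 31  # bound quoted in the error message


def exceeds_gpu_limit(nnz, out_cols):
    """Return True if out_cols * nnz exceeds the bound from the error message."""
    return out_cols * nnz > LIMIT


def max_nnz_for(out_cols):
    """Largest number of nonzeros in the sparse matrix that stays within the bound."""
    return LIMIT // out_cols


# With 256 output columns, the adjacency may hold at most 2**23 nonzeros.
print(max_nnz_for(256))                     # 8388608
print(exceeds_gpu_limit(8_388_608, 256))    # False
print(exceeds_gpu_limit(8_388_609, 256))    # True
```

Under this reading, a full-graph adjacency with more nonzeros than that would trigger the error during evaluation even when the sampled training subgraphs are small enough.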

(graphs) ray@node9:/raid/graphs/GraphSAINT$ python -m graphsaint.tensorflow_version.train --data_prefix=./data/ogbn-product/ --train_config=./train_config/open_graph_benchmark/ogbn-products_3_e_gat.yml --gpu=1
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
/raid/graphs/GraphSAINT/graphsaint/utils.py:112: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
train_config = yaml.load(f_train_config)
Loading training data..
Done loading training data..
/raid/graphs/GraphSAINT/graphsaint/utils.py:190: RuntimeWarning: divide by zero encountered in true_divide
norm_diag = sp.dia_matrix((1/D,0),shape=diag_shape)

layer attentionaggregator_1, dim: [100,256]
layer attentionaggregator_2, dim: [256,256]
layer attentionaggregator_3, dim: [256,256]
layer highorderaggregator_1, dim: [256,47]
WARNING:tensorflow:From /raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py:127: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-10-21 14:45:08.221100: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-10-21 14:45:09.020378: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:07:00.0
totalMemory: 31.72GiB freeMemory: 31.41GiB
2020-10-21 14:45:09.020423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2020-10-21 14:45:09.645754: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-21 14:45:09.645805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2020-10-21 14:45:09.645813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2020-10-21 14:45:09.646048: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 30475 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:07:00.0, compute capability: 7.0)
/raid/graphs/GraphSAINT/graphsaint/graph_samplers.py:168: RuntimeWarning: divide by zero encountered in true_divide
self.adj_train_norm = scipy.sparse.dia_matrix((1 / self.deg_train, 0), shape=adj_train.shape).dot(adj_train)
sampling 200 subgraphs: time = 3.222 sec
START PHASE 0
Epoch 0
Traceback (most recent call last):
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[{{node attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul}} = SparseTensorDenseMatMul[T=DT_FLOAT, Tindices=DT_INT64, adjoint_a=false, adjoint_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_adj_subgraph_0/indices_0_3/_355, attentionaggregator_1/mul_12, _arg_adj_subgraph_0/shape_0_4/_413, attentionaggregator_1/MatMul_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 244, in
tf.app.run(main=train_main)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 239, in train_main
ret = train(train_phases,model,minibatch,sess,train_stat,ph_misc_stat,summary_writer)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 192, in train
evaluate_full_batch(sess_eval,model,minibatch,many_runs_timeline,mode='val')
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 53, in evaluate_full_batch
preds,loss = sess.run([model.preds, model.loss], feed_dict=feed_dict)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[node attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul (defined at /raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py:282) = SparseTensorDenseMatMul[T=DT_FLOAT, Tindices=DT_INT64, adjoint_a=false, adjoint_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_adj_subgraph_0/indices_0_3/_355, attentionaggregator_1/mul_12, _arg_adj_subgraph_0/shape_0_4/_413, attentionaggregator_1/MatMul_1)]]

Caused by op 'attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul', defined at:
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 244, in
tf.app.run(main=train_main)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 238, in train_main
model,minibatch,sess,train_stat,ph_misc_stat,summary_writer = prepare(train_data,train_params,arch_gcn)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/train.py", line 98, in prepare
feats, arch_gcn, train_params, adj_full_norm, logging=True)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py", line 66, in init
self.build()
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py", line 89, in build
_outputs_l = self.aggregate_subgraph()
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/model.py", line 165, in aggregate_subgraph
hidden = self.aggregatorslayer
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py", line 61, in call
outputs = self._call(inputs)
File "/raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py", line 282, in _call
ret_neigh_i = self.act(tf.sparse_tensor_dense_matmul(adj_weighted,vw_neigh[i])
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/ops/sparse_ops.py", line 2004, in sparse_tensor_dense_matmul
adjoint_b=adjoint_b)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/ops/gen_sparse_ops.py", line 2789, in sparse_tensor_dense_mat_mul
name=name)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/home/ray/miniconda3/envs/graphs/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in init
self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[node attentionaggregator_1/SparseTensorDenseMatMul_4/SparseTensorDenseMatMul (defined at /raid/graphs/GraphSAINT/graphsaint/tensorflow_version/layers.py:282) = SparseTensorDenseMatMul[T=DT_FLOAT, Tindices=DT_INT64, adjoint_a=false, adjoint_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](_arg_adj_subgraph_0/indices_0_3/_355, attentionaggregator_1/mul_12, _arg_adj_subgraph_0/shape_0_4/_413, attentionaggregator_1/MatMul_1)]]
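
The condition in the error message can be checked with simple arithmetic before launching a full-batch evaluation. The following is a minimal sketch, not part of GraphSAINT or TensorFlow: the helper names `exceeds_gpu_limit` and `max_nnz` and the example sizes are hypothetical, and the only fact taken from the log is the inequality `output.shape[1] * nnz(a) > 2^31`.

```python
# Limit quoted in the error message: output.shape[1] * nnz(a) must not exceed 2^31.
GPU_LIMIT = 2 ** 31


def exceeds_gpu_limit(nnz, out_cols):
    """Return True if a sparse-dense matmul of this size would trip the check.

    nnz      -- number of non-zero entries in the sparse adjacency matrix
    out_cols -- number of columns of the dense output (output.shape[1])
    """
    return out_cols * nnz > GPU_LIMIT


def max_nnz(out_cols):
    """Largest nnz that still passes the check for a given output width."""
    return GPU_LIMIT // out_cols


# Hypothetical sizes, for illustration only (not taken from the issue).
print(exceeds_gpu_limit(nnz=20_000_000, out_cols=128))  # True: 2.56e9 > 2^31
print(max_nnz(128))                                     # 16777216
```

Under this check, a wider hidden dimension lowers the number of edges the full-batch adjacency matrix may contain before the GPU kernel refuses the operation.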

Bad performance of Vanilla GCN

Hi! I observed that on your Amazon dataset, the performance of vanilla GCN and FastGCN is poor compared to the other models, and I didn't find a relevant explanation in your paper. Also, you mention in the Appendix that you set the batch size of vanilla GCN. Does that mean you use the GCN aggregator of GraphSAGE?

I am trying to understand why the GCN model performs so poorly on large graphs. Can the reason be attributed to SGD/BGD having better generalization ability than full-graph gradient descent? Could you help me with that?

Question about aggregation in backprop

Hello,
I notice that in the backprop, the GCN layer does two aggregations:

sparseMM(subg_trans, grad_in_part2, feat_aggr, num_thread); // use feat_aggr as temp var

sparseMM(subg_trans, grad_in_part2, feat_aggr, num_thread);

this is in the case that
if (dim_weight_in >= dim_weight_out/2) and
if (id_layer > 0)

So for the second GCN layer (id=1), dim_in and dim_out are both dim_hid, so the condition dim_weight_in >= dim_weight_out/2 holds. In this case, sparseMM is computed twice, which is not necessary to my understanding. Am I missing anything?

Meanwhile, in the case of (dim_weight_in < dim_weight_out/2) and (id_layer > 0), lines 137 to 139 don't compute the gradients for lines 97 and 98 in the forward phase.

Thank you!

Best,
Xuhao

Test set F1-micro score of Cluster-GCN on Reddit

Hi, sorry to disturb you.
I have a question about the F1 score on the Reddit dataset.
I notice that in Table 3 of your paper, the F1 score of Cluster-GCN 4x128 is 99.66.
However, I cannot reproduce this accuracy with the open-source implementation in google-research; in my experiment, the best micro-F1 score of Cluster-GCN is 99.63.
I wonder how you achieved 99.66 using Cluster-GCN on Reddit?

How can I get the output embedding?

Hi, thank you for your meaningful work. I ran into a problem after running the code: the only result I can get is the evaluation score of the GraphSAINT model. However, I want the output embeddings. What should I do? Is that possible? Thank you :)

Dataset Amazon and the description do not match.

As the title says, the Amazon dataset does not match its description. We found that the batch size of GraphSAGE has a great influence on performance on the Amazon dataset, as Table 11 in your paper shows. However, the Amazon dataset we downloaded from Google Drive has 1,569,960 nodes, of which 96,295 have no labels, which is inconsistent with the description of the dataset in your paper.

  • Is there any extra processing needed for the Amazon dataset?
  • Can you kindly explain the performance trend (increasing, then decreasing as the batch size increases for GraphSAGE on Amazon)?
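
The count of unlabeled nodes reported above can be reproduced with a short check. The following is a minimal sketch under stated assumptions: it assumes the labels are available as a dictionary mapping each node ID to a multi-hot list of class indicators, and it treats a missing or all-zero list as "no label". The function name `count_unlabeled` and the toy data are hypothetical and are not taken from the dataset.

```python
def count_unlabeled(class_map):
    """Count nodes whose label is missing or whose multi-hot vector is all zeros.

    class_map -- dict mapping node ID to a list of 0/1 class indicators
                 (assumed layout; adjust if the labels are stored differently)
    """
    unlabeled = 0
    for label in class_map.values():
        if label is None:
            unlabeled += 1
        elif isinstance(label, (list, tuple)) and not any(label):
            unlabeled += 1
    return unlabeled


# Toy example with four nodes and three classes (illustrative values only).
toy_class_map = {
    "0": [1, 0, 0],
    "1": [0, 0, 0],  # no class set
    "2": [0, 1, 1],
    "3": [0, 0, 0],  # no class set
}
print(count_unlabeled(toy_class_map))  # 2
```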

Can not run the code

Hi, when I follow the instructions in the README, the following error occurs:
Traceback (most recent call last):
File "/home/wangjialin/Lib/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/wangjialin/Lib/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/wangjialin/Community-Detection/GraphSAINT/graphsaint/train.py", line 1, in
from graphsaint.globals import *
File "/home/wangjialin/Community-Detection/GraphSAINT/graphsaint/globals.py", line 30, in
flags.DEFINE_string('log_dir', '.', 'base directory for logging and saving embeddings')
File "/home/wangjialin/Lib/anaconda3/lib/python3.7/site-packages/tensorflow/python/platform/flags.py", line 58, in wrapper
return original_function(*args, **kwargs)
File "/home/wangjialin/Lib/anaconda3/lib/python3.7/site-packages/absl/flags/_defines.py", line 241, in DEFINE_string
DEFINE(parser, name, default, help, flag_values, serializer, **args)
File "/home/wangjialin/Lib/anaconda3/lib/python3.7/site-packages/absl/flags/_defines.py", line 82, in DEFINE
flag_values, module_name)
File "/home/wangjialin/Lib/anaconda3/lib/python3.7/site-packages/absl/flags/_defines.py", line 104, in DEFINE_flag
fv[flag.name] = flag
File "/home/wangjialin/Lib/anaconda3/lib/python3.7/site-packages/absl/flags/_flagvalues.py", line 430, in setitem
raise _exceptions.DuplicateFlagError.from_flag(name, self)
absl.flags._exceptions.DuplicateFlagError: The flag 'log_dir' is defined twice. First from absl.logging, Se

The Yelp dataset

Hello,

I wonder whether it is possible to release the Yelp dataset in the IPDPS paper. Thank you!
