muhanzhang / pytorch_dgcnn

PyTorch implementation of DGCNN

License: MIT License

pytorch_dgcnn's Introduction

PyTorch DGCNN

About

PyTorch implementation of DGCNN (Deep Graph Convolutional Neural Network). Check https://github.com/muhanzhang/DGCNN for more information.

Requirements: Python 2.7 or Python 3.6; PyTorch >= 0.4.0

Installation

This implementation is based on Hanjun Dai's structure2vec graph backend. Under the "lib/" directory, type

make -j4

to compile the necessary C++ files.

After that, under the root directory of this repository, type

./run_DGCNN.sh

to run DGCNN on dataset MUTAG with the default setting.

Or type

./run_DGCNN.sh DATANAME FOLD

to run on dataset = DATANAME using fold number = FOLD (1-10, corresponds to which fold to use as test data in the cross-validation experiments).

If you set FOLD = 0, e.g., typing "./run_DGCNN.sh DD 0", then it will run 10-fold cross validation on DD and report the average accuracy.

Alternatively, type

./run_DGCNN.sh DATANAME 1 200

to use the last 200 graphs in the dataset as testing graphs. The fold number 1 will be ignored.
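
The effect of the third argument can be pictured as a simple tail split. This is a minimal sketch, assuming the graphs are held in a Python list in file order; `split_last_x` is a hypothetical helper used for illustration, not a function from this repository.

```python
def split_last_x(graphs, test_number):
    """Use the last `test_number` items as the test set, the rest for training."""
    if not 0 < test_number < len(graphs):
        raise ValueError("test_number must be between 1 and len(graphs) - 1")
    cut = len(graphs) - test_number
    return graphs[:cut], graphs[cut:]


# Stand-in for 1000 loaded graphs; integers keep the example self-contained.
train, test = split_last_x(list(range(1000)), 200)
print(len(train), len(test))
```

Because the split is purely positional, the order of graphs in the data file decides which graphs end up in the test set.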

Check "run_DGCNN.sh" for more options.

Datasets

Default graph datasets are stored in "data/DSName/DSName.txt". Check the "data/README.md" for the format.

In addition to the support of discrete node labels (tags), DGCNN now supports multi-dimensional continuous node features. One example dataset with continuous node features is "Synthie". Check "data/Synthie/Synthie.txt" for the format.

There are two preprocessing scripts in MATLAB: "mat2txt.m" transforms .mat graphs (from the Weisfeiler-Lehman Graph Kernel Toolbox), and "dortmund2txt.m" transforms graph benchmark datasets downloaded from https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets

How to use your own data

The first step is to transform your graphs to the format described in "data/README.md". You should put your testing graphs at the end of the file. Then, there is an option -test_number X, which enables using the last X graphs from the file as testing graphs. You may also pass X as the third argument to "run_DGCNN.sh" by

./run_DGCNN.sh DATANAME 1 X

where the fold number 1 will be ignored.
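
As a rough illustration of the first step, the sketch below writes graphs in the layout quoted from "data/README.md" elsewhere on this page: a first line with the number of graphs, then for each graph a header "n l" (number of nodes, graph label) followed by one line "t m neighbors..." per node. The function name `write_dgcnn_txt` and the in-memory tuple layout are assumptions made for this example; "data/README.md" remains the authority on the format.

```python
import os
import tempfile


def write_dgcnn_txt(path, graphs):
    """Write graphs as: count line, then per graph 'n l' and one line per node.

    `graphs` is a list of (graph_label, node_tags, adjacency) tuples, where
    adjacency[i] lists the 0-based neighbor indices of node i.
    """
    with open(path, "w") as f:
        f.write("%d\n" % len(graphs))
        for label, tags, adjacency in graphs:
            f.write("%d %d\n" % (len(tags), label))
            for tag, neighbors in zip(tags, adjacency):
                fields = [tag, len(neighbors)] + list(neighbors)
                f.write(" ".join(str(x) for x in fields) + "\n")


# A single triangle graph with label 1 and node tag 0 on every node.
triangle = (1, [0, 0, 0], [[1, 2], [0, 2], [0, 1]])
path = os.path.join(tempfile.mkdtemp(), "MYDATA.txt")
write_dgcnn_txt(path, [triangle])
with open(path) as f:
    text = f.read()
print(text)
```

When building the list passed to the writer, put the training graphs first and the testing graphs last, so that the last X graphs are the ones held out.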

Reference

If you find the code useful, please cite our paper:

@inproceedings{zhang2018end,
  title={An End-to-End Deep Learning Architecture for Graph Classification.},
  author={Zhang, Muhan and Cui, Zhicheng and Neumann, Marion and Chen, Yixin},
  booktitle={AAAI},
  year={2018}
}

Muhan Zhang, 3/19/2018

pytorch_dgcnn's Issues

Some errors when processing DGCNN

Hi Muhan, can you help me with two errors?
When I followed the README and ran make -j4 under "lib/", it printed the following:
Nothing to be done for `all'.

And when I ran "./run_DGCNN.sh", it produced the following error:
Traceback (most recent call last):
  File "main.py", line 14, in <module>
    from DGCNN_embedding import DGCNN
  File "/Users/jishilun/Desktop/DGCNN_official/DGCNN_embedding.py", line 17, in <module>
    from gnn_lib import GNNLIB
  File "/Users/jishilun/Desktop/DGCNN_official/lib/gnn_lib.py", line 87, in <module>
    GNNLIB = _gnn_lib(sys.argv)
  File "/Users/jishilun/Desktop/DGCNN_official/lib/gnn_lib.py", line 12, in __init__
    self.lib = ctypes.CDLL('%s/build/dll/libgnn.so' % dir_path)
  File "/anaconda3/lib/python3.7/ctypes/__init__.py", line 356, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: dlopen(/Users/jishilun/Desktop/DGCNN_official/lib/build/dll/libgnn.so, 6): no suitable image found. Did find:
/Users/jishilun/Desktop/DGCNN_official/lib/build/dll/libgnn.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00
/Users/jishilun/Desktop/DGCNN_official/lib/build/dll/libgnn.so: unknown file type, first eight bytes: 0x7F 0x45 0x4C 0x46 0x02 0x01 0x01 0x00

Those are all the errors, and I cannot resolve them.
Can you tell me where the error occurs?
Thank you very much!
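
One observation on the report above: the eight bytes printed in the error start with 0x7F followed by the ASCII codes for "ELF", which is the magic number of ELF binaries as used on Linux. That suggests the libgnn.so found in the build folder was compiled for a different platform than the macOS machine loading it, and that make saw it as up to date. The sketch below only classifies a file header by its magic bytes; `binary_format` is a hypothetical helper, and removing the stale build outputs so that make recompiles them is a guess at the remedy rather than a confirmed fix.

```python
def binary_format(first_bytes):
    """Guess the executable format from the first bytes of a file."""
    if first_bytes[:4] == b"\x7fELF":
        return "ELF"
    # 64-bit and 32-bit Mach-O magic numbers as stored on little-endian machines.
    if first_bytes[:4] in (b"\xcf\xfa\xed\xfe", b"\xce\xfa\xed\xfe"):
        return "Mach-O"
    if first_bytes[:2] == b"MZ":
        return "PE"
    return "unknown"


# The eight bytes quoted in the error message above.
header = bytes([0x7F, 0x45, 0x4C, 0x46, 0x02, 0x01, 0x01, 0x00])
print(binary_format(header))
```

To check a file on disk, read its first bytes with `open(path, "rb").read(8)` and pass them to the function.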

Reproduce results in the original paper

Hi,
I am trying to reproduce the results in the original paper with this code. I ran the code with

./run_DGCNN.sh DD 0

on the datasets protein, D&D, Collab and IMDB. I got an accuracy of around 60 on all of them, which is lower than the results in the paper. Is there any way I can tune the model to reproduce the results?

Thank you

The accuracy of prediction on MUTAG dataset

Hi, I ran the code with its default settings, which train on MUTAG; the learning rate and other parameters were left at their default values. However, I can only achieve 72.22% accuracy after 300 epochs of training, which differs from the 85.83% reported in the paper. Could you suggest how to fix this?

Shuffling dataset

Hello. Is there a way to shuffle the dataset? I have the graphs in the .txt, but unfortunately, the similarly labeled graphs are grouped together. Is there a way to shuffle them before running?
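
One way to do this outside the training code is to shuffle the graph blocks of the text file before running. The sketch below assumes the layout from "data/README.md" (a count line, then for each graph a header "n l" followed by n node lines); `read_graph_blocks` and `shuffle_graphs` are hypothetical helper names, not part of this repository.

```python
import random


def read_graph_blocks(lines):
    """Split the file's lines into one list of lines per graph."""
    n_graphs = int(lines[0].split()[0])
    blocks, pos = [], 1
    for _ in range(n_graphs):
        n_nodes = int(lines[pos].split()[0])
        blocks.append(lines[pos:pos + 1 + n_nodes])  # header + node lines
        pos += 1 + n_nodes
    return blocks


def shuffle_graphs(lines, seed=0):
    """Return the file's lines with whole graphs reordered at random."""
    blocks = read_graph_blocks(lines)
    random.Random(seed).shuffle(blocks)  # seeded, so the order is reproducible
    out = [str(len(blocks))]
    for block in blocks:
        out.extend(block)
    return out


# Two tiny graphs: a 2-node graph with label 0 and a 1-node graph with label 1.
example = ["2", "2 0", "0 1 1", "0 1 0", "1 1", "5 0"]
shuffled = shuffle_graphs(example, seed=1)
print("\n".join(shuffled))
```

For a real file, read it with `open(path).read().splitlines()` and write the result to a new file, keeping the original as a backup.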

About feature

How can I output the last-layer features of the neural network for each graph in the list of test graphs?

I have a little question about code

def loop_dataset(g_list, classifier, sample_idxes, optimizer=None, bsize=cmd_args.batch_size):
    total_loss = []
    total_iters = (len(sample_idxes) + (bsize - 1) * (optimizer is None)) // bsize
    pbar = tqdm(range(total_iters), unit='batch')

    n_samples = 0
    for pos in pbar:
        selected_idx = sample_idxes[pos * bsize : (pos + 1) * bsize]

        batch_graph = [g_list[idx] for idx in selected_idx]
        _, loss, acc = classifier(batch_graph)

        if optimizer is not None:
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        loss = loss.data.cpu().numpy()
        pbar.set_description('loss: %0.5f acc: %0.5f' % (loss, acc))

        total_loss.append(np.array([loss, acc]) * len(selected_idx))

        n_samples += len(selected_idx)
    if optimizer is None:
        assert n_samples == len(sample_idxes)
    total_loss = np.array(total_loss)
    avg_loss = np.sum(total_loss, 0) / n_samples
    return avg_loss, acc

This function returns acc as test_acc, but this acc is just the last batch's accuracy. Is this a mistake? The function is in main.py.
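
Reading the snippet as posted, total_loss accumulates [loss, acc] scaled by the batch size, so avg_loss ends up holding two values: the size-weighted mean loss and the size-weighted mean accuracy. The separately returned acc is whatever the final batch produced. The sketch below reproduces that arithmetic in plain Python with made-up numbers; `average_metrics` is a hypothetical name used only for this illustration.

```python
def average_metrics(batches):
    """batches: list of (loss, acc, batch_size). Returns size-weighted means."""
    n = sum(size for _, _, size in batches)
    avg_loss = sum(loss * size for loss, _, size in batches) / n
    avg_acc = sum(acc * size for _, acc, size in batches) / n
    return avg_loss, avg_acc


# Illustrative values only: two full batches and one small final batch.
batches = [(0.7, 0.5, 50), (0.5, 0.75, 50), (0.25, 1.0, 10)]
avg_loss, avg_acc = average_metrics(batches)
last_batch_acc = batches[-1][1]
print(avg_acc, last_batch_acc)
```

In this example the small final batch happens to be perfectly classified, so the last-batch accuracy overstates the accuracy over all 110 samples. Under this reading, the second entry of avg_loss is the figure to report for a whole fold.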

dataset

Hello author, I would like to ask how to create the .mat files in the dataset. I am new to graph neural networks and do not understand how the datasets are produced.

I have a little question about the code

In DGCNN_embedding.py, line 38, it seems that you set the kernel size to 0. Does that make sense?

Visualization

Hi, Professor. Thank you for providing the code. Could you tell me how to export HD images in MeshLab or other software (as shown in the paper)?

Run with error

Hi,
Thanks for the contribution.
I have a problem when importing the compiled dgnn. When reading libgnn.so, the following error was raised: OSError: "[WinError 126] The specified module could not be found".

Thanks,

Segmentation Fault error on windows10

When running "python main.py", it produces a segmentation fault.

Question about parameter selection

Hello, Professor,

Could you please tell me how you found the parameters associated with each dataset in the code? Grid search or manual tuning?

Thanks very much!

implementation details

Thanks for the release of the pytorch version of DGCNN.
However, it seems that it is implemented based on s2v rather than graph network.
So, which one is the proper version (torch or pytorch) to reproduce the results?

Why can't the s2v_lib, embedding and pytorch_util files be found?

Hello!
from s2v_lib import S2VLIB
from pytorch_util import weights_init, gnn_spmm
from embedding import EmbedMeanField, EmbedLoopyBP
These cannot be imported!

How to save the trained model and use for final test like demo?

Hi,
I am training the model on my own dataset and saving it with
torch.save(classifier.state_dict(), PATH)

However, when I load it, it shows IncompatibleKeys(missing_keys=[], unexpected_keys=[])
It looks like nothing was saved. How can I save it?

Another question: if the validation loss looks very low, how can I use the trained model to predict the class of an unseen graph?

How to install on Windows?

I want to run this in a Windows environment. How do I install it?

Saving Model and Predicting classes

How can I save the trained model? What line of code needs to be modified?

Moreover, how can I load the saved model in order to predict the label of a new NetworkX graph?

How to use Synthie?

Hi, to use the Synthie dataset, is it enough to change the dataset name in run_DGCNN.sh to Synthie? I am a bit confused.

Problem when using Mac M1

Hi Dr. Zhang,
When I tried to run this code on Mac M1, I met this problem:
OSError: dlopen(/Users/PycharmProjects/test/pytorch_DGCNN/lib/build/dll/libgnn.so, 0x0006): tried: '/Users/PycharmProjects/test/pytorch_DGCNN/lib/build/dll/libgnn.so' (mach-o file, but is an incompatible architecture (have (arm64), need (x86_64)))
What should I do to solve this problem? Could you please give me some advice? Thank you!

How to convert to multilabel classifier

Hello,

How can I convert the code into a multilabel graph classifier?

Thanks

What is the tag of a node?

Hi,
I read your data format description:

"the i-th line describes the information of ith node (0 based), which starts with t m, where t is the tag of current node, and m is the number of neighbors of current node",

Can you explain what the tag of a node is?

About the environment

Hello author, thank you very much for the method you proposed, which is very suitable for my task. However, after installing the required package versions and running the program as instructed, I hit the following problem and have been unable to solve it. May I ask what went wrong?

Traceback (most recent call last):
  File "C:/Users/Administrator/Desktop/share_file/GNN/graph_classification/pytorch_DGCNN-master/main.py", line 14, in <module>
    from DGCNN_embedding import DGCNN
  File "C:\Users\Administrator\Desktop\share_file\GNN\graph_classification\pytorch_DGCNN-master\DGCNN_embedding.py", line 17, in <module>
    from gnn_lib import GNNLIB
  File "C:\Users\Administrator\Desktop\share_file\GNN\graph_classification\pytorch_DGCNN-master/lib\gnn_lib.py", line 87, in <module>
    GNNLIB = _gnn_lib(sys.argv)
  File "C:\Users\Administrator\Desktop\share_file\GNN\graph_classification\pytorch_DGCNN-master/lib\gnn_lib.py", line 12, in __init__
    self.lib = ctypes.CDLL('%s/build/dll/libgnn.so' % (dir_path))
  File "C:\Anaconda3\envs\pytorch\lib\ctypes\__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 is not a valid Win32 application.

questions about social network datasets

Hi, I have been reading your paper and am confused about the social network datasets.
Taking IMDB as an example, I want to know what an input graph and its nodes denote.
I think each input graph denotes a movie review, and the nodes in the graph represent words in the corresponding text. Edges represent word co-occurrence information, depending on the size of the sliding window.
If so, what is X (the initial node input feature)?
If I have it wrong, then what do they denote?
Can you help me answer these questions? Thank you very much!

Using my own dataset ?

Hi! I am using my own dataset, named PHY, with DGCNN. I have formatted my data as described in the README.
However, my dataset is large (300 000) and I would prefer to proceed without 10-fold cross-validation (training on 230 000 and testing on 70 000). For this, I ran ./run_DGCNN.sh Phy 1 70000. Although it trains and the loss decreases, I get a test accuracy of 0 and the following message:
UndefinedMetricWarning: No positive samples in y_true, true positive value should be meaningless.
I googled the warning, and apparently it thinks all my labels are 0 (while I should have 4 class labels). What do you think the problem is?

Also, could you point out where I can find how much dropout is used after a dense layer, and where I can check the number of fully connected layers used (not the number of nodes, just the number of hidden layers)?

Using DGCNN for nodes with multidimensional attributes

Hi,
I'm trying to follow your work and use it for weather forecasting. The nodes in my dataset have multidimensional attributes, and I saw in your paper that DGCNN can handle them. But I couldn't find this procedure or the data format in your code. Also, in the datasets provided in this pytorch_DGCNN repository, the nodes only have tags rather than multidimensional attributes. Do I need to write the code for multidimensional attributes myself?

About node feature

Hi, thanks for the amazing work. I have a question: as far as I know, in ENZYMES every node in every graph has features, but in your 'proteins.txt' you seem to use only node labels. I do not know why.

question about node features

Hello, Muhan. Can you help me with this question?
My graphs fall into three categories. All of them have the same structure; only the features of the nodes differ. I want to know whether your code can handle this.
Thanks

load_data

with open('data/%s/%s.txt' % (cmd_args.data, cmd_args.data), 'r') as f:
    n_g = int(f.readline().strip())
    for i in range(n_g):
        row = f.readline().strip().split()
        n, l = [int(w) for w in row]   # If a row has more than two values, this unpacking fails with an error. Why?

row = [0,1,7]
ValueError: too many values to unpack (expected 2)
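
The unpacking in that snippet expects every graph header line to hold exactly two integers (number of nodes and graph label). A three-value row such as "0 1 7" looks like a node line, so one plausible cause is that the reader has drifted out of step with the file, for example because a graph's header states the wrong number of nodes or the count on the first line is off. The sketch below walks a file the same way and reports the first header that does not have two fields; `find_first_bad_header` is a hypothetical diagnostic, not part of this repository.

```python
def find_first_bad_header(lines):
    """Return (1-based line number, text) of the first graph header that is
    not of the form 'n l', or None if every header looks fine."""
    n_graphs = int(lines[0].split()[0])
    pos = 1
    for _ in range(n_graphs):
        if pos >= len(lines):
            return (pos + 1, "<end of file>")
        fields = lines[pos].split()
        if len(fields) != 2:
            return (pos + 1, lines[pos])
        pos += 1 + int(fields[0])  # skip this graph's node lines
    return None


# The first graph claims 2 nodes but is followed by 3 node lines,
# so the third node line is mistaken for the next graph header.
broken = ["2", "2 1", "0 1 1", "0 1 0", "0 1 7", "1 0", "5 0"]
print(find_first_bad_header(broken))
```

Running it on `open(path).read().splitlines()` points at the first place where the declared node counts and the actual node lines disagree.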

question about datasets

Hello, I am new to graph classification. Besides the datasets you used in the code, I also want to try other datasets such as 'KKI' and 'BZR', which can be downloaded from https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets. But I don't know how to get the .mat file, and I don't really understand the .mat format. How can I convert _A.txt, _edge_labels.txt, _graph_indicator.txt, _graph_labels.txt and _node_labels.txt to a .mat file?
Could you please help me? Thanks!

Question about accuracy

Hi, I have a question about how to report the accuracy. In your experiments, you reported the accuracy at convergence after a fixed number of epochs. For example, your code runs MUTAG for 300 epochs, reports the accuracy of the 300th epoch for each fold, and averages over 10 folds, which ends up around 85%. However, if we adopt early stopping and report only the best accuracy for each fold, we can achieve an accuracy of 91.6%. I'm curious why you didn't do that to get a better result.

Are these hyperparameters the final version?

Are these hyperparameters the final version? I have tried many times, and the classification accuracy of some datasets is quite different from that described in the paper.

Is it possible to use this pipeline with Dataset more than 2 classes? And how to do that?

Compiling the C++ files

Hello, my environment is Python 3.6 with PyTorch 1.0, so I can't use your 32-bit files compiled from the C++ sources. Can you provide recompiled 64-bit files? Thank you!

AttributeError: type object 'MySpMM' has no attribute 'apply'

Hi!
I got this error when running with python 2.7.15 under gpu mode

Traceback (most recent call last):
  File "main.py", line 187, in <module>
    avg_loss = loop_dataset(train_graphs, classifier, train_idxes, optimizer=op\
timizer)
  File "main.py", line 131, in loop_dataset
    logits, loss, acc = classifier(batch_graph)
  File "/home/wyw10804/anaconda2/envs/env_DGCNN/lib/python2.7/site-packages/tor\
ch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 107, in forward
    embed = self.s2v(batch_graph, node_feat, None)
  File "/home/wyw10804/anaconda2/envs/env_DGCNN/lib/python2.7/site-packages/tor\
ch/nn/modules/module.py", line 206, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wyw10804/ssl-graph-classification/benchmarks/pytorch_DGCNN-master\
/DGCNN_embedding.py", line 71, in forward
    h = self.sortpooling_embedding(node_feat, edge_feat, n2n_sp, e2n_sp, subg_s\
p, graph_sizes, node_degs)
  File "/home/wyw10804/ssl-graph-classification/benchmarks/pytorch_DGCNN-master\
/DGCNN_embedding.py", line 87, in sortpooling_embedding
    n2npool = gnn_spmm(n2n_sp, cur_message_layer) + cur_message_layer  # Y = (A\
 + I) * X
  File "/home/wyw10804/ssl-graph-classification/benchmarks/pytorch_DGCNN-master\
/pytorch_structure2vec-master/s2v_lib/pytorch_util.py", line 70, in gnn_spmm
    return MySpMM.apply(sp_mat, dense_mat)
AttributeError: type object 'MySpMM' has no attribute 'apply'

Can you help me with this issue? Thanks!

Error raised in using the DGCNN implementation

Hello Dr. Zhang,

Thank you for this code. Can you please help me with the issue below?

I am using SEAL that exploits DGCNN. When I am running the file run_DGCNN.sh or trying to import functions from main.py in pytorch_DGCNN, it gives me "OSError: [WinError 193] %1 is not a valid Win32 application" error.
Kindly help.

Thanks in advance!

TypeError while trying to run ./run_DGCNN.sh

While trying to run ./run_DGCNN.sh I get the same error each time I run for different datasets:
TypeError: new(): data must be a sequence (got dict_values)

>> ./run_DGCNN.sh

====== begin of s2v configuration ======
| msg_average = 0
======   end of s2v configuration ======
Namespace(batch_size=50, data='DD', dropout=True, extract_features=False, feat_dim=0, fold=1, gm='DGCNN', hidden=128, latent_dim=[32, 32, 32, 1], learning_rate=1e-05, max_lv=4, mode='cpu', num_class=0, num_epochs=200, out_dim=0, printAUC=False, seed=1, sortpooling_k=0.6, test_number=0)
loading data
# classes: 2
# maximum node tag: 82
# train: 1061, # test: 117
k used in SortPooling is: 291
Initializing DGCNN
  0%|                                                                                                                                                                 | 0/21 [00:00<?, ?batch/s]Traceback (most recent call last):
  File "main.py", line 187, in <module>
    avg_loss = loop_dataset(train_graphs, classifier, train_idxes, optimizer=optimizer)
  File "main.py", line 131, in loop_dataset
    logits, loss, acc = classifier(batch_graph)
  File "/home/egorc/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 107, in forward
    embed = self.s2v(batch_graph, node_feat, None)
  File "/home/egorc/.local/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/egorc/pytorch_DGCNN/DGCNN_embedding.py", line 53, in forward
    node_degs = [torch.Tensor(graph_list[i].degs) + 1 for i in range(len(graph_list))]
  File "/home/egorc/pytorch_DGCNN/DGCNN_embedding.py", line 53, in <listcomp>
    node_degs = [torch.Tensor(graph_list[i].degs) + 1 for i in range(len(graph_list))]
TypeError: new(): data must be a sequence (got dict_values)

Steps to reproduce:

Install DGCNN:
- Clone repository
- unzip pytorch_structure2vec-master.zip
- cd pytorch_structure2vec-master/s2vlib/
- make -j4
- cd ../..
Change python to python3 in run_DGCNN.sh
Change gpu_or_cpu=gpu to gpu_or_cpu=cpu in run_DGCNN.sh
Comment import cPickle as cp
Uncomment import _pickle as cp # python3 compatability in util.py
Run ./run_DGCNN.sh

Environment:
Distributor ID: Ubuntu Description: Ubuntu 16.04.5 LTS Release: 16.04 Codename: xenial
Python version: Python 3.5.2
Packages:
absl-py (0.5.0) astor (0.7.1) autokeras (0.2.18) blinker (1.3) boto (2.38.0) chardet (2.3.0) cloud-init (18.3) command-not-found (0.3) configobj (5.0.6) cryptography (1.2.3) decorator (4.3.0) gast (0.2.0) google-compute-engine (2.8.2) grpcio (1.15.0) h5py (2.8.0) idna (2.0) Jinja2 (2.8) jsonpatch (1.10) jsonpointer (1.9) Keras (2.2.2) Keras-Applications (1.0.4) Keras-Preprocessing (1.0.2) language-selector (0.1) Markdown (3.0.1) MarkupSafe (0.23) networkx (2.2) numpy (1.15.2) oauthlib (1.0.3) Pillow (5.3.0) pip (8.1.1) prettytable (0.7.2) protobuf (3.6.1) pyasn1 (0.1.9) pycurl (7.43.0) pygobject (3.20.0) PyJWT (1.3.0) pyserial (3.0.1) python-apt (1.1.0b1+ubuntu0.16.4.2) python-debian (0.1.27) python-systemd (231) PyYAML (3.13) requests (2.9.1) scikit-learn (0.20.0) scipy (1.1.0) setuptools (39.1.0) six (1.11.0) sklearn (0.0) ssh-import-id (5.5) tensorboard (1.11.0) tensorflow (1.11.0) termcolor (1.1.0) torch (0.4.1) torchvision (0.2.1) tqdm (4.25.0) ufw (0.35) unattended-upgrades (0.1) urllib3 (1.13.1) Werkzeug (0.14.1) wheel (0.32.1)
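
A note on the TypeError above: in Python 3, dict.values() returns a view object rather than a list, and the traceback shows torch.Tensor refusing it as "not a sequence". The sketch below demonstrates the difference using only the standard library. It assumes, without having the relevant source in front of it, that the `degs` attribute is built from the values of a node-degree dictionary; if so, wrapping those values in list(...) where `degs` is created would be the likely remedy.

```python
from collections.abc import Sequence

# Hypothetical node -> degree mapping standing in for a graph's degree table.
degree_by_node = {0: 2, 1: 3, 2: 2}

degs_view = degree_by_node.values()  # Python 3: a dict_values view
degs_list = list(degs_view)          # a real list, which is a sequence

print(type(degs_view).__name__, isinstance(degs_view, Sequence))
print(type(degs_list).__name__, isinstance(degs_list, Sequence))
```

Under Python 2 the same call returned a list directly, which would explain why the code ran there but not after switching the script to python3.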

.mat files

Hello, I am new to graph classification. Could you tell me where the code used to generate the .mat files in the data folder is? If my graph benchmark datasets are downloaded from https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets, do I need only the dortmund2txt.m file instead of the mat2txt.m file?

Error during initialization

Hi all,

thanks for this great tool.
I am using SEAL, which builds on DGCNN, and I am getting the following error during the initialization of your tool:

pytorch_DGCNN/main.py:175: RuntimeWarning: invalid value encountered in double_scalars
avg_loss = np.sum(total_loss, 0) / n_samples
Traceback (most recent call last):
File "Main.py", line 241, in
train_graphs, classifier, train_idxes, optimizer=optimizer, bsize=args.batch_size
File "/home/scarlos/Documents/git/SEAL/Python/../../pytorch_DGCNN/main.py", line 176, in loop_dataset
all_scores = torch.cat(all_scores).cpu().numpy()
NotImplementedError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors, or that you (the operator writer) forgot to register a fallback function. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

Any idea?
Thanks.

Torch not compiled with CUDA enabled in CPU mode

When I run the command ./run_DGCNN.sh in CPU mode, I get the following error.

I set the variable gpu_or_cpu=cpu, and I don't have CUDA on this PC.

How can I fix this error?

====== begin of s2v configuration ======
| msg_average = 0
====== end of s2v configuration ======
Namespace(batch_size=50, data='DD', dropout=True, extract_features=False, feat_dim=0, fold=10, gm='DGCNN', hidden=128, latent_dim=[32, 32, 32, 1], learning_rate=1e-05, max_lv=4, mode='cpu', num_class=0, num_epochs=200, out_dim=0, printAUC=False, seed=1, sortpooling_k=0.6, test_number=0)
loading data
# classes: 2
# maximum node tag: 82
# train: 1061, # test: 117
k used in SortPooling is: 291
Initializing DGCNN
  0%| | 0/21 [00:00<?, ?batch/s]
Traceback (most recent call last):
  File "main.py", line 187, in <module>
    avg_loss = loop_dataset(train_graphs, classifier, train_idxes, optimizer=optimizer)
  File "main.py", line 131, in loop_dataset
    logits, loss, acc = classifier(batch_graph)
  File "/home/mat/Escritorio/DGCNN/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "main.py", line 107, in forward
    embed = self.s2v(batch_graph, node_feat, None)
  File "/home/mat/Escritorio/DGCNN/env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/mat/Escritorio/DGCNN/pytorch_DGCNN/DGCNN_embedding.py", line 58, in forward
    if isinstance(node_feat, torch.cuda.FloatTensor):
  File "/home/mat/Escritorio/DGCNN/env/lib/python3.6/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
    _check_driver()
  File "/home/mat/Escritorio/DGCNN/env/lib/python3.6/site-packages/torch/cuda/__init__.py", line 75, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

libgnn shared object library

Hi, may I ask about the motivation behind writing libgnn in C (i.e. better performance)? Is it possible for me to implement it entirely in python? I am thinking of implementing it entirely in networkx to remove the dependence on the shared object library.

Thank you!

some issues about the code in pytorch_embeding.py

Hello, I'd like to know why, in line 38, the stride is also set to sum(latent_dim). When I run the code, I hit the following problem:
RuntimeError: Calculated padded input size per channel: (1). Kernel size: (5). Kernel size can't be greater than actual input size.
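
The error message can be reproduced with plain length arithmetic. The sketch below assumes the layer layout described in the DGCNN paper: the k sorted node rows, each of width sum(latent_dim), are flattened and passed through a 1-D convolution whose kernel size and stride both equal sum(latent_dim) (so each node becomes one position), then a max-pooling layer of size 2 and stride 2, then a second 1-D convolution with kernel size 5. The helper names are hypothetical, and the actual layer parameters in DGCNN_embedding.py should be checked against this assumption.

```python
def out_len(length, kernel, stride=1):
    """Output length of an unpadded 1-D convolution or pooling layer."""
    if length < kernel:
        raise ValueError("kernel size %d exceeds input length %d" % (kernel, length))
    return (length - kernel) // stride + 1


def dgcnn_lengths(k, latent_dim, second_kernel=5):
    """Sequence lengths after conv1, max-pool and conv2 under the assumed layout."""
    total = sum(latent_dim)
    after_conv1 = out_len(k * total, total, total)  # one position per node
    after_pool = out_len(after_conv1, 2, 2)
    after_conv2 = out_len(after_pool, second_kernel)
    return after_conv1, after_pool, after_conv2


print(dgcnn_lengths(291, [32, 32, 32, 1]))
print(dgcnn_lengths(10, [32, 32, 32, 1]))
try:
    dgcnn_lengths(3, [32, 32, 32, 1])
except ValueError as err:
    print("k = 3 fails:", err)
```

Under these assumptions, setting the stride equal to the kernel size is what makes the first convolution step over exactly one node at a time, and a SortPooling k below 10 leaves fewer than 5 positions for the second convolution, which matches the shape of the reported error (input size 1, kernel size 5).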

Multiple GPUs

How can one change the code to make use of all the available GPUs?

Is there any support for visualisation like precision, recall, model_visualisation, training_loss, validation_losss?

Hi,
My dataset consists of 20 classes, and I will add more classes in the future. Nodes contain only an integer label as a feature. Is it still possible to classify whole graphs? I think it should work, because for the sort pooling layer it doesn't matter whether nodes have label features or continuous features.
Am I correct?
Could you also provide some support for visualizing the network?

An error about SEAL-master python code Main.py

Hi Muhan, I have an error when I run the code Main.py on windows,
Can you give me some advice?
Thank you so much~

Traceback (most recent call last):
  File "E:/iSS/SEAL-master/Python/Main.py", line 12, in <module>
    from main import *
  File "E:\iSS\SEAL-master\Python/pytorch_DGCNN\main.py", line 14, in <module>
    from DGCNN_embedding import DGCNN
  File "E:\iSS\SEAL-master\Python/pytorch_DGCNN\DGCNN_embedding.py", line 17, in <module>
    from gnn_lib import GNNLIB
  File "E:\iSS\SEAL-master\Python\pytorch_DGCNN/lib\gnn_lib.py", line 87, in <module>
    GNNLIB = _gnn_lib(sys.argv)
  File "E:\iSS\SEAL-master\Python\pytorch_DGCNN/lib\gnn_lib.py", line 12, in __init__
    self.lib = ctypes.CDLL('%s/build/dll/libgnn.so' % dir_path)
  File "E:\Python\Anaconda3\lib\ctypes\__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 193] %1 is not a valid Win32 application.

About Node Features

Hi, I am new to graph classification. I have a question: if a node has a string (natural-language) attribute, how can I convert the string into a node feature?
Could I use a word embedding model, such as word2vec, to convert the string into a node feature?
Thanks!

Question about directed graphs

Hello, Professor,

Could you give me some hints on modifying the code to handle directed graph data?

I have modified util.py (load_data()), but I get an error like RuntimeError: size is inconsistent with indices: for dim 0, size is 1877 but found index 3626

i have a question about the code

n_samples = 0
for pos in pbar:
    selected_idx = sample_idxes[pos * bsize : (pos + 1) * bsize]

    batch_graph = [g_list[idx] for idx in selected_idx]
    _, loss, acc = classifier(batch_graph)

    if optimizer is not None:
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    loss = loss.data.cpu().numpy()
    pbar.set_description('loss: %0.5f acc: %0.5f' % (loss, acc) )

    total_loss.append( np.array([loss, acc]) * len(selected_idx))

    n_samples += len(selected_idx)
if optimizer is None:
    assert n_samples == len(sample_idxes)
total_loss = np.array(total_loss)
avg_loss = np.sum(total_loss, 0) / n_samples
return avg_loss, acc

This function returns acc as test_acc, but this acc is actually the last batch's accuracy, not the test fold's accuracy. Is this a mistake?

'tag of current node'?

Hi,
many thanks for the PyTorch implementation of the DGCNN.

Do you have a Python implementation (instead of the MATLAB script) to generate the input text file from the individual text files (_graph_labels.txt, _graph_indicator.txt, _A.txt)? I am not familiar with MATLAB. What does the following line in the script do? What is the equivalent in Python code?
A = spones(sparse(edges(:,1), edges(:,2), edge_attr(:,1), num_nodes, num_nodes))

Also, what do you mean by 'tag of current node'? Why are there, for example, 14 rows with 'node tag'=2 but different numbers of neighbors in the following example of yours?

188 # total number of graphs
23 2 # number of nodes in the first graph (23), graph label of first graph (2)
2 2 1 13 # tag of current node (2), number of neighbors (2), index first neighbor (1), index second neighbor (13)
2 2 0 2
2 3 1 3 11
2 2 2 4
2 2 3 5
2 3 4 6 10
2 3 5 7 20
2 2 6 8
2 2 7 9
2 3 8 10 15
2 3 5 9 11
2 3 2 10 12
2 3 11 13 14
2 2 0 12

Thanks,
Marie
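
On the MATLAB line quoted above: sparse(i, j, v, m, n) builds an m-by-n sparse matrix with the values v at positions (i, j), and spones replaces every nonzero entry with 1, so A is a binary adjacency matrix over all nodes of the dataset. A rough pure-Python counterpart of the conversion is sketched below. It assumes the usual layout of the Dortmund files (one "i, j" pair of 1-based node ids per line in _A.txt, one graph id per node in _graph_indicator.txt, one label per graph in _graph_labels.txt, and optionally one label per node in _node_labels.txt). The function `dortmund_to_txt` is hypothetical and has not been checked against the output of dortmund2txt.m; it also adds every edge in both directions, which is an assumption of this sketch.

```python
def dortmund_to_txt(edge_lines, indicator_lines, graph_label_lines, node_label_lines=None):
    """Convert Dortmund-style text files (given as lists of lines) to the
    'count / n l / t m neighbors...' layout. Returns the output lines."""
    graph_of_node = [int(x) for x in indicator_lines if x.strip()]   # 1-based graph ids
    graph_labels = [int(x) for x in graph_label_lines if x.strip()]
    if node_label_lines is None:
        node_tags = [0] * len(graph_of_node)  # no node labels: use a single tag
    else:
        node_tags = [int(x.split(",")[0]) for x in node_label_lines if x.strip()]

    # Group nodes by graph and give each node a 0-based index within its graph.
    nodes_by_graph = [[] for _ in graph_labels]
    local_index = []
    for node, g in enumerate(graph_of_node):
        local_index.append(len(nodes_by_graph[g - 1]))
        nodes_by_graph[g - 1].append(node)

    # Binary adjacency stored as neighbor sets (the role spones(sparse(...)) plays).
    neighbors = [set() for _ in graph_of_node]
    for line in edge_lines:
        if line.strip():
            i, j = (int(x) - 1 for x in line.split(","))
            neighbors[i].add(j)
            neighbors[j].add(i)

    out = [str(len(graph_labels))]
    for g, nodes in enumerate(nodes_by_graph):
        out.append("%d %d" % (len(nodes), graph_labels[g]))
        for node in nodes:
            local = sorted(local_index[v] for v in neighbors[node])
            out.append(" ".join(str(x) for x in [node_tags[node], len(local)] + local))
    return out


# Toy dataset: a 3-node path (graph 1) and a single edge (graph 2).
edges = ["1, 2", "2, 1", "2, 3", "3, 2", "4, 5", "5, 4"]
indicator = ["1", "1", "1", "2", "2"]
graph_labels = ["1", "0"]
node_labels = ["0", "1", "0", "2", "2"]
result = dortmund_to_txt(edges, indicator, graph_labels, node_labels)
print("\n".join(result))
```

As for the example in the question, the first number on each node line appears to be a discrete node label (for instance an atom type in a molecule dataset), so many nodes can share tag 2 while having different neighbors.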

Error arose when running the program

python: src/lib/msg_pass.cpp:20: void n2n_construct(GraphStruct*, long long int*, Dtype*): Assertion `nnz == (int)graph->num_edges' failed.

Could you help check this problem?

invalid ELF header?

While in the pytorch_DGCNN directory, I ran git clone https://github.com/Hanjun-Dai/pytorch_structure2vec and ran make -j4 in ./s2v_lib/. I then moved all the contents of pytorch_structure2vec into pytorch_DGCNN, since main.py expects s2v.py to be in the same directory.
When running, I get the following error:

Traceback (most recent call last):
  File "main.py", line 14, in <module>
    from DGCNN_embedding import DGCNN
  File "/vagrant/DGCNN_embedding.py", line 16, in <module>
    from s2v_lib import S2VLIB
  File "/vagrant/s2v_lib.py", line 127, in <module>
    S2VLIB = _s2v_lib(sys.argv)
  File "/vagrant/s2v_lib.py", line 11, in __init__
    self.lib = ctypes.CDLL('%s/build/dll/libs2v.so' % dir_path)
  File "/usr/lib/python2.7/ctypes/__init__.py", line 366, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /vagrant/build/dll/libs2v.so: invalid ELF header