cszhangzhen / hgp-sl
Hierarchical Graph Pooling with Structure Learning
Hi, thanks for your code. But when I run python main.py --dataset DD, there is an error as follows:
RuntimeError: CUDA out of memory. Tried to allocate 18.75 GiB (GPU 0; 15.90 GiB total capacity; 2.36 GiB already allocated; 10.80 GiB free; 559.64 MiB cached)
Thank you very much for this implementation. It was really helpful.
Currently, I am training on a dataset of my own for human activity recognition. I have converted the skeleton data into a graph structure.
Number of classes = 75
Number of features = 4
The edge_index is always the same for every graph.
During training, my model always predicts the same label for every sample, which is why the loss is not decreasing and the accuracy is not increasing. Are there any further modifications I need to make to run this?
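When a classifier collapses to a single label like this, a strong class imbalance is a common first suspect. A minimal sketch of one mitigation, inverse-frequency class weighting (the label tensor and class count below are toy placeholders, not this dataset):

```python
import torch
import torch.nn.functional as F

# Hypothetical toy labels; with 75 activity classes, a heavy imbalance
# often makes a model collapse to the majority label.
labels = torch.tensor([0, 0, 0, 0, 1, 2, 0, 1])
num_classes = 3

# Inverse-frequency weights: rare classes contribute more to the loss.
counts = torch.bincount(labels, minlength=num_classes).float()
weights = counts.sum() / (num_classes * counts.clamp(min=1.0))

logits = torch.randn(len(labels), num_classes)
loss = F.cross_entropy(logits, labels, weight=weights)
```

It is also worth verifying that the node features actually differ between classes; with an identical `edge_index` everywhere, the features carry all the discriminative signal.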
Hello, thanks for the code.
I have a question about the way you split train/val/test dataset.
The results on the datasets you use are strongly influenced by the split method. In your code, you set seed 777 and then split the dataset randomly. I want to know whether your reported results are based on the same seed or on different seeds.
Looking forward to your reply.
Hi!
Great work you have been doing. I am not an NN expert, but I think the idea is important for many problems, and I would like to use this GNN to test something. However, due to my imbalanced dataset, the classification collapses to a single class. I guess I could change this network from classification to regression, right? I have already changed the last linear layer to output 1 without an activation and changed the input, but I still get an error. Do you know if it is possible to quickly change it to a regression problem?
Thank you
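For what it's worth, a minimal sketch of a regression head (the `hidden_dim` and shapes below are illustrative assumptions, not the repo's actual names): besides swapping the last linear layer, the loss and target dtype also have to change from `nll_loss` with class indices to `mse_loss` with floats, and any trailing `log_softmax` must be removed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# `hidden_dim` stands in for the size of the pooled graph readout.
hidden_dim = 128
head = nn.Linear(hidden_dim, 1)      # single continuous output, no softmax

x = torch.randn(4, hidden_dim)       # batch of graph-level embeddings
pred = head(x).squeeze(-1)           # shape [4]; squeezing avoids broadcast bugs
target = torch.randn(4)              # continuous labels, float dtype
loss = F.mse_loss(pred, target)      # replaces F.nll_loss
```

If the target is still an integer class tensor, `mse_loss` will raise a dtype error, which may be the error being hit here.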
Hello, when I reproduce your code, the model trains normally until the 22nd epoch and then raises an error. Have you encountered this situation? Do you have any ideas on how to solve it? Looking forward to your answer, thank you!
Epoch: 0023 loss_train: 1.181976 acc_train: 0.696629 loss_val: 0.594611 acc_val: 0.720721 time: 119.426900s
Traceback (most recent call last):
File "D:\pycharmprojects\pythongraph\HGP-SL-master\main.py", line 130, in <module>
best_model = train()
File "D:\pycharmprojects\pythongraph\HGP-SL-master\main.py", line 74, in train
out = model(data)
File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "D:\pycharmprojects\pythongraph\HGP-SL-master\models.py", line 39, in forward
x, edge_index, edge_attr, batch = self.pool1(x, edge_index, edge_attr, batch)
File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "D:\pycharmprojects\pythongraph\HGP-SL-master\layers.py", line 204, in forward
hop_data = self.neighbor_augment(hop_data)
File "D:\pycharmprojects\pythongraph\HGP-SL-master\layers.py", line 22, in __call__
index, value = spspmm(edge_index, value, edge_index, value, n, n, n, True)
File "D:\Anaconda3\lib\site-packages\torch_sparse\spspmm.py", line 30, in spspmm
C = matmul(A, B)
File "D:\Anaconda3\lib\site-packages\torch_sparse\matmul.py", line 140, in matmul
return spspmm(src, other, reduce)
File "D:\Anaconda3\lib\site-packages\torch_sparse\matmul.py", line 117, in spspmm
return spspmm_sum(src, other)
File "D:\Anaconda3\lib\site-packages\torch_sparse\matmul.py", line 107, in spspmm_sum
sparse_sizes=(M, K), is_sorted=True)
File "D:\Anaconda3\lib\site-packages\torch_sparse\tensor.py", line 38, in __init__
trust_data=trust_data,
File "D:\Anaconda3\lib\site-packages\torch_sparse\storage.py", line 79, in __init__
assert trust_data or int(col.max()) < N
AssertionError
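The assertion `int(col.max()) < N` fires when an edge index is out of range for the node count passed to `spspmm`; this is usually traced to a torch_sparse version mismatch, so pinning the versions listed in the README is the first thing to try. If upgrading or pinning is not an option, a defensive filter before the `spspmm` call (a hypothetical helper, not part of the repo) could look like:

```python
import torch

def clip_edges(edge_index, edge_weight, num_nodes):
    # Drop edges whose endpoints fall outside [0, num_nodes); an
    # out-of-range column index is exactly what trips
    # `assert trust_data or int(col.max()) < N` in torch_sparse.
    mask = (edge_index[0] < num_nodes) & (edge_index[1] < num_nodes)
    return edge_index[:, mask], edge_weight[mask]

# Toy example: one edge refers to node 5 but only 4 nodes exist.
edge_index = torch.tensor([[0, 1, 5], [1, 2, 0]])
edge_weight = torch.ones(3)
edge_index, edge_weight = clip_edges(edge_index, edge_weight, num_nodes=4)
```

This only masks the symptom; if indices are genuinely out of range after pooling, the node count being passed down is the real thing to check.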
Hi, your work is excellent. However I find a gap between results obtained via your code and results reported in your paper. Specifically:
| | Mutagenicity | NCI109 | NCI1 | DD |
|---|---|---|---|---|
| Code | 79.68 (1.68) | 73.86 (1.72) | 76.29 (2.14) | 75.46 (3.86) |
| Paper | 82.15 (0.58) | 80.67 (1.16) | 78.45 (0.77) | 80.96 (1.26) |
I followed your hyper-parameter settings and did not change any part of your code. Why does this happen? I'm using Python 3.7 with the latest versions of PyTorch and PyTorch Geometric. Is it possible that a mismatched PyG version causes such a gap?
How do I run this code on a user-defined dataset?
Hi, I run the code with `python main.py`, but it crashes with a segmentation fault. How can I fix it?
Hi, thanks for your code.
In my code, I have to convert my batched sparse adjacency matrices into a single dense batched adjacency matrix, since my adjacency matrices are very big.
However, when I apply HGP-SL to my code, this error occurs:
ValueError: too many values to unpack (expected 2)
So I checked the code and found `row, col = edge_index` in layers.py, which means the parameter passed to HGP-SL must be a sparse adjacency matrix. I really want to use HGP-SL in my code, but I don't know how to change HGP-SL so that I can pass a dense batched adjacency matrix instead.
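Rather than modifying HGP-SL's layers, it is probably easier to convert the dense adjacency back to the `edge_index` format the layers expect just before calling them (PyG also ships `torch_geometric.utils.dense_to_sparse` for exactly this). A minimal pure-PyTorch sketch:

```python
import torch

def dense_to_sparse(adj):
    # Recover edge_index / edge_weight from a dense adjacency matrix,
    # which is the format `row, col = edge_index` in layers.py expects.
    row, col = adj.nonzero(as_tuple=True)
    return torch.stack([row, col]), adj[row, col]

adj = torch.tensor([[0., 2., 0.],
                    [2., 0., 1.],
                    [0., 1., 0.]])
edge_index, edge_weight = dense_to_sparse(adj)
```

For a batched dense adjacency, the block-diagonal offsets per graph would also have to be added to the indices, which is what PyG's batching normally handles.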
Hi~ your paper is really an amazing work!
After reading the paper, I became even more convinced of this. But now I have a question about the accuracy of the baseline SAGPool: its own paper reports an accuracy considerably lower than the one in your paper, not just slightly. Could you give me some tips about this? Thank you!
Hello, your work is great. But there are some doubts, I hope you can answer them. Thx!
As mentioned in your paper, the experimental results are obtained by running the random split 10 times. But I noticed that you did not provide the code for running it 10 times. Is seed 777 used when splitting the dataset? In other words, is a fixed dataset split used each time across the ten runs, and is seed 777 used for all datasets?
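For reference, one common protocol (this is a guess at the intended setup, not the authors' confirmed script) fixes a seed per run and redraws the random split each time:

```python
import torch

# Sketch of a 10-run protocol with a fresh split per run; whether the
# paper reuses seed 777 for every run is exactly the open question here.
num_graphs = 100
results = []
for run in range(10):
    torch.manual_seed(run)            # or 777 to reuse one fixed split
    perm = torch.randperm(num_graphs)
    train_idx = perm[:80]             # 80/10/10 split as in the repo
    val_idx = perm[80:90]
    test_idx = perm[90:]
    # ... train on train_idx, select on val_idx, evaluate on test_idx ...
    results.append(len(test_idx))     # placeholder for the run's test accuracy
```

Reporting mean and standard deviation over `results` then matches the "accuracy (std)" format used in the paper's tables.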
Hello, I'm interested in your paper, but I cannot get similar results when running your code. So I tried installing PyTorch Geometric according to the versions you specified. However, I couldn't find the correct instructions to install pytorch, torch-scatter, torch-sparse, torch-cluster, and torch-spline-conv; if I install them following the PyG website, it installs the latest versions.
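For what it's worth, the version pins reported elsewhere in these issues can be installed roughly as follows; the build toolchain/CUDA details are an assumption, and a matching torch-spline-conv version is not listed in the issues, so it is left unpinned:

```shell
# Pins taken from the versions reported in these issues. Older
# torch-scatter/torch-sparse releases compile from source against the
# installed torch, so install torch first and keep this order.
pip install torch==1.3.0
pip install torch-scatter==1.4.0
pip install torch-sparse==0.4.3
pip install torch-cluster==1.4.5
pip install torch-spline-conv   # matching pin not listed in the issues
pip install torch-geometric==1.3.2
```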
Hi, your paper is really an interesting work!
However, I have two questions about the layers used in your network: `GCNConv` and the newly defined `NodeInformationScore`. Is there any difference between these two layers? I think you use the random-walk version of the Laplacian (i.e. I - D^{-1}A) in your paper, but in the `NodeInformationScore` implementation you use the symmetric version (i.e. I - D^{-1/2}AD^{-1/2}). Also, I personally find the implementation has a problem: the `edge_weight` after `add_remaining_self_loops` includes self-loops, so the final result of the `norm` function is I - D^{-1/2}(A+I)D^{-1/2} = -D^{-1/2}AD^{-1/2} instead of I - D^{-1/2}AD^{-1/2}.
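A small numeric check on a toy 2-node graph (not the repo's code) confirms the core of this observation: adding self-loops before normalizing changes the resulting operator.

```python
import torch

# Toy 2-node graph.
A = torch.tensor([[0., 1.],
                  [1., 0.]])
I = torch.eye(2)

# Degrees computed after adding self-loops, as add_remaining_self_loops does.
A_hat = A + I
d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)
D_inv_sqrt = torch.diag(d_inv_sqrt)

with_loops = I - D_inv_sqrt @ A_hat @ D_inv_sqrt   # what the code computes
without = I - D_inv_sqrt @ A @ D_inv_sqrt          # what the formula states
```

Here `with_loops` and `without` differ, so the self-loop handling does change the information score, whichever form the paper intends.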
Hi! I like your work. So I downloaded the code and ran it 5 times on my desktop, and I found the performance on the ENZYMES dataset looks strange. The seeds and the corresponding HGP-SL accuracies are the following:
I'm using Python 3.7.4, pytorch==1.3.0, torch-scatter==1.4.0, torch-sparse==0.4.3, torch-cluster==1.4.5 and torch-geometric==1.3.2, as you recommend in the code. What I changed are the hyperparameters and the number of layers, as you suggested. I must have missed something, but I don't know what it is. Can you help me?
main.py
model.py
I have given this repo a star for its high quality. Strangely, it has fewer than 100 stars; perhaps it lacks publicity. I am a researcher focusing on intrusion detection, and GNNs may provide a novel angle on anomalous network traffic analysis. Since the figure "model.png" is so beautiful, could you tell me how you drew it? Visio? PowerPoint? Another tool? And could you share the source file of the figure?
Hello, this is an excellent paper, but I cannot reproduce the previously mentioned accuracy on D&D and ENZYMES. I have read the code and found that the network architecture has 3 layers, but according to your readme.md the architecture for these two datasets has 2 layers, and I found no details of the 2-layer architecture in the paper. Could you tell me the details of the 2-layer architecture for D&D and ENZYMES? Looking forward to your reply.
After changing the random seed, the performance may fluctuate by more than 10%.
Hello, thank you for your great contributions! I want to know how to deal with the dimension mismatch when I use a multi-head GAT model, where x3 carries the head-averaged features while x1 and x2 carry the concatenated features.
x = F.relu(x1) + F.relu(x2) + F.relu(x3)
This raises a dimension error.
Thanks!
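One hedged workaround (the head count and channel sizes below are illustrative, not the repo's settings) is to project the head-averaged output up to the concatenated width before summing:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed shapes: x1, x2 come from GAT layers with concat=True
# (heads * hidden channels), x3 from a layer with concat=False (hidden).
heads, hidden = 4, 32
x1 = torch.randn(10, heads * hidden)
x2 = torch.randn(10, heads * hidden)
x3 = torch.randn(10, hidden)

align = nn.Linear(hidden, heads * hidden)  # lift x3 to the common width
x = F.relu(x1) + F.relu(x2) + F.relu(align(x3))
```

Alternatively, setting `concat=False` (or otherwise matching output sizes) on all three GAT layers avoids the extra projection entirely.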
Hello, I am very interested in your paper. I am reproducing your experiments, but I don't know why it keeps raising: TypeError: coalesce() got an unexpected keyword argument 'fill_value'