Comments (21)

muhanzhang commented on July 26, 2024

Hi, thanks for the questions! For 1), the GNN parameters are shared across different subgraphs. For 2), the enclosing subgraphs can be of different sizes. There is a pooling layer after the graph convolutions that summarizes node embeddings into a subgraph representation used for predicting the link, thus unifying subgraphs of different sizes into representations of the same size.

GNNs are not only used for semi-supervised node classification. They can also be used for graph classification, where a dataset of graphs of different sizes is given and their classes are to be predicted.
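
(A minimal sketch of this weight sharing and pooling, assuming a pytorch_geometric setup; the `SubgraphScorer` name, layer sizes, and mean pooling are illustrative assumptions, not SEAL's actual architecture, which uses SortPooling:)

```python
import torch
from torch_geometric.nn import GCNConv, global_mean_pool

class SubgraphScorer(torch.nn.Module):
    """One set of GNN parameters, reused for every enclosing subgraph."""
    def __init__(self, in_dim=16, hidden=32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)    # shared across all subgraphs
        self.conv2 = GCNConv(hidden, hidden)
        self.lin = torch.nn.Linear(hidden, 1)   # link-existence score

    def forward(self, x, edge_index, batch):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        g = global_mean_pool(h, batch)          # any node count -> fixed-size vector
        return self.lin(g).squeeze(-1)          # one score per subgraph
```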

vymao commented on July 26, 2024

I see, thank you. I had some follow-up questions:

1). With regard to the GNN parameter weight matrix, do you just iterate through each training subgraph and use the previous iteration's weight matrix as the starting point?
2). I read that the SortPooling operation is a top-k operation that unifies the sizes of the sorted representations of different graphs. What happens if the subgraph size is less than k? What is used to fill in the empty features?
3). I assume the following 1-D convolution and dense layers are shared across all subgraphs?

muhanzhang commented on July 26, 2024
  1. The weight matrix is shared. Each training iteration takes a minibatch of subgraphs, applies the same current weight matrix, calculates the loss gradients, and then updates the weight matrix. The next batch then uses the updated weight matrix to continue training, and so on.
  2. All-zero vectors are used as padding if the subgraph size is less than k.
  3. Yes!
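
(A minimal sketch of the zero-padding in point 2; sorting by the last feature channel follows the DGCNN SortPooling convention, but this is an illustration, not the official code:)

```python
import torch

def sort_pool(node_feats: torch.Tensor, k: int) -> torch.Tensor:
    """Sort nodes by their last feature channel (DGCNN convention),
    keep the top k, and zero-pad when the subgraph has fewer than k nodes."""
    order = node_feats[:, -1].argsort(descending=True)
    x = node_feats[order][:k]                        # truncate if size > k
    if x.size(0) < k:                                # pad if size < k
        pad = torch.zeros(k - x.size(0), x.size(1))  # all-zero vectors
        x = torch.cat([x, pad], dim=0)
    return x                                         # always (k, channels)
```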

vymao commented on July 26, 2024

Ok, thanks! Could you also explain the idea of negative injection? I didn't quite understand it. I see you insert non-existent links into the actual graph for learning embeddings. Is this for node2vec-generated embeddings? At what point do you actually insert these links?

muhanzhang commented on July 26, 2024

Yes, it is for node2vec embeddings. If we directly generate node2vec embeddings on the observed network, the embeddings will largely "remember" the observed links. Then, when training the GNN, the GNN will get lazy and classify positive and negative links from this information alone, thus overfitting. Therefore, we temporarily add negative links to the observed graph before generating node2vec embeddings, to alleviate this overfitting. After generating the node2vec embeddings, we remove those negative links again and use the original network to extract enclosing subgraphs.

Nevertheless, our experiments show that adding node2vec embeddings to SEAL is basically neutral to its performance. Thus, we recommend using only subgraphs and node attributes in SEAL, unless they perform poorly and there exists some node embedding algorithm that learns much better embeddings than node2vec.
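
(A sketch of the negative-injection procedure described above, using networkx; `train_node2vec` is a hypothetical stand-in for any node2vec implementation, not a real API:)

```python
import random
import networkx as nx

def embeddings_with_negative_injection(G: nx.Graph, num_neg: int, train_node2vec):
    """Temporarily add sampled non-edges before learning embeddings,
    then restore the original graph for subgraph extraction."""
    nodes = list(G.nodes())
    injected = []
    while len(injected) < num_neg:
        u, v = random.sample(nodes, 2)
        if not G.has_edge(u, v):           # a genuine non-edge (negative link)
            G.add_edge(u, v)
            injected.append((u, v))
    emb = train_node2vec(G)                # embeddings can't just "remember" true links
    G.remove_edges_from(injected)          # restore the original network afterwards
    return emb                             # subgraph extraction then uses the original G
```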

vymao commented on July 26, 2024

So if you hadn't done negative injection, the prediction would be much worse?

Also, if you believe that the node2vec embeddings might be subpar and that other embedding algorithms could learn the embeddings better, wouldn't that worsen the overfitting problem?

muhanzhang commented on July 26, 2024

Right, when using node2vec embeddings in SEAL, negative injection is necessary. But even with negative injection, node2vec embeddings seem neutral on most datasets.

If other embeddings are better than node2vec, then with negative injection they won't overfit and might provide gains to SEAL. But without negative injection, they may indeed worsen the overfitting.

vymao commented on July 26, 2024

Going back to the training: you mentioned that "each training iteration will take a minibatch of subgraphs, apply the same current weight matrix, calculate the loss gradients, and then update the weight matrix". What is the minibatch of subgraphs? Are you talking about sampling from the neighbors in the local enclosing subgraph? If that is the case, then when you say "subgraphs", are you using multiple sets of minibatch neighbors from different nodes?

muhanzhang commented on July 26, 2024

No. The minibatch of subgraphs is just a set of subgraphs that SGD trains on together. There is no further sampling from the neighbors of an enclosing subgraph once it has been extracted; the enclosing subgraphs are completely independent of each other. Please read this paper for more context on graph classification. Thanks!
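
(As a sketch, each enclosing subgraph becomes an independent graph-classification example, and SGD steps over minibatches of them. This reuses the hypothetical `SubgraphScorer` from the earlier sketch; the sizes, random edges, and labels are made up for illustration:)

```python
import torch
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader

subgraphs = [
    Data(x=torch.randn(n, 16),                        # node count n varies per subgraph
         edge_index=torch.randint(0, n, (2, 3 * n)),  # random edges for illustration
         y=torch.tensor([float(n % 2)]))              # dummy link label
    for n in (5, 9, 7, 12)
]
loader = DataLoader(subgraphs, batch_size=2, shuffle=True)

model = SubgraphScorer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for batch in loader:                                  # one SGD step per minibatch
    opt.zero_grad()
    scores = model(batch.x, batch.edge_index, batch.batch)
    loss = torch.nn.functional.binary_cross_entropy_with_logits(scores, batch.y)
    loss.backward()
    opt.step()                                        # next batch sees the updated weights
```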

vymao commented on July 26, 2024

Thanks. I read the paper and also briefly looked at the implementation. I had a question about the description; it says:

For each graph signal we also need to have the corresponding adjacency matrices of shape (batch, N, N) or (batch, timesteps, N, N) for temporal and non-temporal data, respectively. While DGCNNs can operate on graphs with different node-counts, C should always be the same and each batch should only contain graphs with the same number of nodes.

1). Should temporal and non-temporal be switched?
2). Is the last statement always true in your implementation, so that you construct minibatches based on identical subgraph size? Is it strictly necessary?

muhanzhang commented on July 26, 2024

The referenced implementation is from a third party. In the official implementation, each batch can contain graphs of different sizes.
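
(For what it's worth, pytorch_geometric achieves this by stacking a minibatch into one block-diagonal graph, so node counts need not match; a small sketch with made-up sizes:)

```python
import torch
from torch_geometric.data import Batch, Data

g1 = Data(x=torch.randn(5, 16), edge_index=torch.randint(0, 5, (2, 8)))
g2 = Data(x=torch.randn(9, 16), edge_index=torch.randint(0, 9, (2, 14)))
batch = Batch.from_data_list([g1, g2])   # one disconnected graph of 5 + 9 nodes
print(batch.num_nodes)                   # 14: the sizes simply add up
print(batch.batch)                       # per-node graph id: [0]*5 + [1]*9
```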

vymao commented on July 26, 2024

Have you tested this framework/modified this framework on directed graphs?

muhanzhang commented on July 26, 2024

Yes. ogbl-citation is a directed graph and SEAL still achieves state-of-the-art results; see https://github.com/facebookresearch/SEAL_OGB. But I haven't explicitly tried to model directed edges.

vymao commented on July 26, 2024

Interesting. Did you modify SEAL in any way? How did you interpret the directionality of a predicted link existence?

Also, I was curious: the ogbl-ddi dataset doesn't have any node labels. In general, if you don't use given node features/embeddings, do you just use the Double-Radius Node Labeling (DRNL) labels as the only features?

muhanzhang commented on July 26, 2024

No, I didn't. That is, currently A->B and B->A will be predicted the same. But it is also easy to break such symmetry by modifying the DRNL to assign different node labels to the source and destination nodes.

For ogbl-ddi, yes, I just use the DRNL labels. This is the same for the smaller datasets such as USAir, NS, etc. that I use here.
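
(For reference, a sketch of DRNL as defined in the SEAL paper: a node's label is determined by its distances to the two target nodes. The per-node recomputation here is for clarity, not efficiency:)

```python
import networkx as nx

def drnl_label(G: nx.Graph, x, y, i) -> int:
    """Double-Radius Node Labeling: label node i by its distances to the
    target nodes x and y. The targets themselves get label 1."""
    if i in (x, y):
        return 1
    Gx, Gy = G.copy(), G.copy()
    Gx.remove_node(y)                 # distance to x is measured with y removed
    Gy.remove_node(x)                 # distance to y is measured with x removed
    try:
        dx = nx.shortest_path_length(Gx, i, x)
        dy = nx.shortest_path_length(Gy, i, y)
    except nx.NetworkXNoPath:
        return 0                      # unreachable nodes get label 0
    d = dx + dy
    return 1 + min(dx, dy) + (d // 2) * (d // 2 + d % 2 - 1)
```

Breaking the A->B / B->A symmetry mentioned above could then be as simple as returning different labels for x and y instead of 1 for both.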

vymao commented on July 26, 2024

I see. So for training in the ogbl-ddi dataset, did you just assume every directed edge was undirected? And in testing in the ogbl-ddi dataset, how did you output what direction the edge would be? Was it random?

muhanzhang commented on July 26, 2024

ogbl-ddi is undirected.

vymao commented on July 26, 2024

Sorry, I meant ogbl-citation.

muhanzhang commented on July 26, 2024

> I see. So for training in the ogbl-ddi dataset, did you just assume every directed edge was undirected? And in testing in the ogbl-ddi dataset, how did you output what direction the edge would be? Was it random?

No. In pytorch_geometric, undirected graphs are simply treated as directed graphs with edges in both directions, so directed and undirected graphs are handled the same way. Message passing happens only along edge directions.

For testing in ogbl-citation, the test edges' directions are given. You just output a score for each directed testing edge.
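
(A tiny sketch of that representation: an undirected graph just stores both directions of every edge, so direction-following message passing covers both endpoints:)

```python
import torch
from torch_geometric.utils import to_undirected

directed = torch.tensor([[0, 1, 2],    # source nodes
                         [1, 2, 0]])   # destination nodes
undirected = to_undirected(directed)   # adds the reverse of every edge
# `undirected` now contains each edge in both directions, so message
# passing along edge directions reaches both endpoints of every edge.
```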

vymao commented on July 26, 2024

I see, thanks. When you create the local enclosing subgraph for any pair of nodes, how do you handle the edges that extend out of or into the enclosing subgraph? Do you just ignore those?

Also, how much memory did it take to run this for, say, ogbl-collab?

muhanzhang commented on July 26, 2024

Those edges are by definition not included in the enclosing subgraph: an edge is kept only if both of its end nodes are within the k-hop neighborhood.

The memory for running ogbl-collab is about 20 GB.
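
(A sketch of why those boundary edges drop out, using networkx; the hop number k and the helper name are illustrative:)

```python
import networkx as nx

def enclosing_subgraph(G: nx.Graph, x, y, k: int = 1) -> nx.Graph:
    """k-hop enclosing subgraph around the target link (x, y). Edges with an
    endpoint outside the k-hop node set are excluded by construction."""
    nodes = set(nx.single_source_shortest_path_length(G, x, cutoff=k))
    nodes |= set(nx.single_source_shortest_path_length(G, y, cutoff=k))
    return G.subgraph(nodes).copy()    # keeps only edges with both ends inside
```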