Hi, I want to use GraKel to compare program's function call graph similarity, each ca

Dear <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

Labels are discrete and each label is independent from the others. On the other

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

Which algorithm can be used when both node and edge are labeled? about grakel HOT 9 CLOSED

ysig commented on August 16, 2024

Which algorithm can be used when both node and edge are labeled?

from grakel.

Comments (9)

giannisnik commented on August 16, 2024 1

Dear @tiedaoxiaotubie

With regards to your last question, by increasing the number of iterations of the Weisfeiler-Lehman subtree kernel, it is not necessary that the normalized kernel values will keep increasing. Here is a counter-example: suppose we are given a graph G1 that corresponds to the path P5 (path consisting of 5 vertices) and a graph G2 that is the disjoint union of a the cycle C3 (triangle) and P2 (path consisting of 2 vertices). We annotate the vertices of both graphs with the same label (e.g., 1). After the first iteration of the Weisfeiler-Lehman subtree kernel, the label sets of the two graphs are identical, thus the normalized kernel value is equal to 1. At the end of the second interation, the label sets of the two graphs are not identical, thus the normalized kernel value will be smaller than 1.

from grakel.

giannisnik commented on August 16, 2024

Hi @tiedaoxiaotubie ,

The NeighborhoodSubgraphPairwiseDistance kernel can take both node and edge labels into account. You can have a look at this example that uses this kernel: https://github.com/ysig/GraKeL/blob/master/tutorials/digit_classification/digit_classification.ipynb

Note that the node and edge labels need to be discrete. You mentioned that you use execution time as edge label. Does that mean that these labels are real-valued? If this is the case, the kernel would not be able to handle them. One easy workaround is to apply some binning approach and map these execution times to integers.

from grakel.

tiedaoxiaotubie commented on August 16, 2024

Thanks for your answer, that's quite helpful! But I still have 3 questions:

what's the different between label and attribute? it seems attribute must be a vector
In the link you sent to me, I noticed weisfeiler_lehman doesn't ignore the edge label, but depends on the base_kernel, does it means we can still use weisfeiler_lehman in our situation？
how to transform NetworkX edge labeled graphs to GraKeL graphs? In example code, only shows the node labeled situation, I use the following instruction:

G = graph_from_networkx(cfg_all, node_labels_tag='label', edge_labels_tag='execute_counts')

am I right?

from grakel.

giannisnik commented on August 16, 2024

Labels are discrete and each label is independent from the others. On the other hand, attributes can be one-dimensional or multi-dimensional (i.e., vectors) and they are not independent from each other (for instance, 10.0 is more similar to 8.0 than to 1.0).
The Weisfeiler-Lehman framework expects node-labeled graphs and applies an iterative procedure that updates these node labels. As you mention, it can work on top of other kernels, however, it cannot be applied to graphs that do not contain node labels.
Yes, what you are doing is correct.

from grakel.

tiedaoxiaotubie commented on August 16, 2024

Neat! I just tried Weisfeiler-Lehman to calculate the similarity, I found no matter I label edge or not, the graph similarity are always same! That's weird.

Besides, I though the the number of iterations in Weisfeiler-Lehman will affect the result monotonic, but when I add the iterations, the precision didn't rise, so the relationship between iterations number and graph similarity is not monotonic?

from grakel.

giannisnik commented on August 16, 2024

The kernel values are supposed to increase as the number of iterations increase (if not normalized). But it was not clear to me if you exactly meant that. Or do you mean the accuracy (or some other evaluation metric) on the test set of some dataset? Because these two are not related to each other.

from grakel.

tiedaoxiaotubie commented on August 16, 2024

Oh, I forgot to mention that, in fact the data are normalized when I used :

gk = WeisfeilerLehman(n_iter=1, base_graph_kernel=VertexHistogram, normalize=True)

suppose you have several graphs that are very same, after increasing the number of iterations, will the graph similar metrics become more and more high? for example, the graph similarity may be 70% when you iterate 4 times, if you increate the iterate number from 4 to 10, the graph similarity will always higher than 70%?

from grakel.

17853535785 commented on August 16, 2024

Hello, I now have a question similar to yours: based on the use of Weisfeiler-Lehman to calculate the similarity of graph structure, I want to use it to calculate the similarity of graphs with weights (edges of the graph).
I don't know if it can be implemented directly using GraKel, or can it be changed on this basis?

from grakel.

giannisnik commented on August 16, 2024

Hi @17853535785 ,

This has been discussed in a previous thread: #37

from grakel.

Which algorithm can be used when both node and edge are labeled? about grakel HOT 9 CLOSED

Comments (9)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent