Light

Kernels suitable for continuous adjacency matrices? about grakel HOT 10 CLOSED

ysig commented on August 16, 2024

Kernels suitable for continuous adjacency matrices?

from grakel.

Comments (10)

giannisnik commented on August 16, 2024 1

Hi @marsuplink . Some kernels take into account the weights of the edges. For instance, the shortest path kernel computes the shortest paths between all pairs of nodes. So, if the weight of the edge between nodes A and B is 0.1 and that of the edge between nodes C and D is 0.9, then the distance between A and B is shorter than that between C and D. However, the majority of the kernels ignore the magnitude of these weights. This is the case for the WL subtree kernel: edge weights greater than 0 all are equivalent to each other; they all denote the existence of an (unweighted) edge between the edges' endpoints.

from grakel.

giannisnik commented on August 16, 2024 1

You can indeed use the shortest path kernel along with the SVM classifier. However, I should mention that the implemented shortest path kernel compares shortest path lengths using a dirac kernel. So, the kernel may not perform that well if your graphs have only a few shortest paths with identical length or no shortest paths with identical lengths.

With regards to the performance of the two kernels, that depends a lot on the application. But on most benchmark graph classification datasets, the WL kernel outperforms the shortest path kernel.

from grakel.

giannisnik commented on August 16, 2024 1

Two other kernels that you could use which utilize directly the adjacency matrix and thus take edge weights into account is the pyramid match kernel and the multiscale Laplacian kernel.

from grakel.

giannisnik commented on August 16, 2024 1

If the edge weights are in [0,1], for the shortest path kernel, you could set each positive weight w equal to 1-w. However, since the shortest path kernel compares lengths using a dirac kernel, I don't t this will have any impact on performance.

from grakel.

giannisnik commented on August 16, 2024 1

@marsuplink the kernel takes edge weights into account. For instance, consider the following example:
import numpy as np
from grakel.kernels import MultiscaleLaplacianFast

adj = np.array([[0,0,1,1],[0,0,1,0],[1,1,0,1],[1,0,1,0]])
node_attributes = {0:[1], 1:[1], 2:[1], 3:[1]}
G1 = [adj, node_attributes]

adj = np.array([[0,0,.2,.2],[0,0,.2,0],[.2,.2,0,.2],[.2,0,.2,0]])
node_attributes = {0:[1], 1:[1], 2:[1], 3:[1]}
G2 = [adj, node_attributes]

Gs = [G1,G2]

ml_kernel = MultiscaleLaplacianFast(P=4, L=2, n_samples=4)
K = ml_kernel.fit_transform(Gs)
print(K)

The resulting matrix is:
K=[[0.5 0.01875952]
[0.01875952 0.5 ]]
If the kernel did not take edge weights into account then, the two adjacency matrices would be identical to each other and K[0,1] would be equal to K[0,0] and K[1,1].

from grakel.

marsuplink commented on August 16, 2024

Thanks @giannisnik for the helpful reply! So if I wanted to classify graphs with weighted edges using an SVM, I can use the shortest path kernel for the SVM, would that work well in terms of classification accuracy?

Do you think the WL kernel works better than the shortest path kernel for unweighted graphs?

from grakel.

marsuplink commented on August 16, 2024

Ah that's good to know, thanks! Will definitely give it a try and use the weighted adjacency matrix directly for the pyramid match kernel and the multiscale Laplacian kernel.

p.s. given what you said about the shortest path kernel, i.e. a 0.1 entry indicates short distance between two nodes and 0.9 indicates long distance, I should do some kind of inversion so the adjacency matrix entries going into the gk.fit_transorm() are inversely proportional to the degree? Since usually higher degree corresponds to tighter connection. Or is this taken care of automatically when I give a weighted adjacency matrix.

Thanks Giannis!

from grakel.

marsuplink commented on August 16, 2024

I see, will experiment with this. Thanks!

from grakel.

marsuplink commented on August 16, 2024

@giannisnik thanks again for your help. I'm experimenting with your suggestions, for the multiscale Laplacian kernel, I'm calling it with data as such after initialization: Ks = gk.fit_transform([[weighted_adj, {1: vec1, 2: vec2, ...}], ... ]), where vec1, vec2 are node attribute vectors.

I noticed that the classification accuracy doesn't change much at all after replacing the weighted adjacency matrix with some random matrix (like the identify), am I feeding the data in correctly? I take it that the multiscale Laplacian kernel doesn't discretize the weighted adjacency matrix and just treat nonzero entries as 1's based on what you said earlier.

Thanks!

from grakel.

marsuplink commented on August 16, 2024

Thanks for the example @giannisnik!

from grakel.

Related Issues (20)

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.