Comments (10)
Hi @marsuplink . Some kernels take into account the weights of the edges. For instance, the shortest path kernel computes the shortest paths between all pairs of nodes. So, if the weight of the edge between nodes A and B is 0.1 and that of the edge between nodes C and D is 0.9, then the distance between A and B is shorter than that between C and D. However, the majority of the kernels ignore the magnitude of these weights. This is the case for the WL subtree kernel: edge weights greater than 0 all are equivalent to each other; they all denote the existence of an (unweighted) edge between the edges' endpoints.
from grakel.
You can indeed use the shortest path kernel along with the SVM classifier. However, I should mention that the implemented shortest path kernel compares shortest path lengths using a dirac kernel. So, the kernel may not perform that well if your graphs have only a few shortest paths with identical length or no shortest paths with identical lengths.
With regards to the performance of the two kernels, that depends a lot on the application. But on most benchmark graph classification datasets, the WL kernel outperforms the shortest path kernel.
from grakel.
Two other kernels that you could use which utilize directly the adjacency matrix and thus take edge weights into account is the pyramid match kernel and the multiscale Laplacian kernel.
from grakel.
If the edge weights are in [0,1], for the shortest path kernel, you could set each positive weight w equal to 1-w. However, since the shortest path kernel compares lengths using a dirac kernel, I don't t this will have any impact on performance.
from grakel.
@marsuplink the kernel takes edge weights into account. For instance, consider the following example:
import numpy as np
from grakel.kernels import MultiscaleLaplacianFast
adj = np.array([[0,0,1,1],[0,0,1,0],[1,1,0,1],[1,0,1,0]])
node_attributes = {0:[1], 1:[1], 2:[1], 3:[1]}
G1 = [adj, node_attributes]
adj = np.array([[0,0,.2,.2],[0,0,.2,0],[.2,.2,0,.2],[.2,0,.2,0]])
node_attributes = {0:[1], 1:[1], 2:[1], 3:[1]}
G2 = [adj, node_attributes]
Gs = [G1,G2]
ml_kernel = MultiscaleLaplacianFast(P=4, L=2, n_samples=4)
K = ml_kernel.fit_transform(Gs)
print(K)
The resulting matrix is:
K=[[0.5 0.01875952]
[0.01875952 0.5 ]]
If the kernel did not take edge weights into account then, the two adjacency matrices would be identical to each other and K[0,1] would be equal to K[0,0] and K[1,1].
from grakel.
Thanks @giannisnik for the helpful reply! So if I wanted to classify graphs with weighted edges using an SVM, I can use the shortest path kernel for the SVM, would that work well in terms of classification accuracy?
Do you think the WL kernel works better than the shortest path kernel for unweighted graphs?
from grakel.
Ah that's good to know, thanks! Will definitely give it a try and use the weighted adjacency matrix directly for the pyramid match kernel and the multiscale Laplacian kernel.
p.s. given what you said about the shortest path kernel, i.e. a 0.1 entry indicates short distance between two nodes and 0.9 indicates long distance, I should do some kind of inversion so the adjacency matrix entries going into the gk.fit_transorm()
are inversely proportional to the degree? Since usually higher degree corresponds to tighter connection. Or is this taken care of automatically when I give a weighted adjacency matrix.
Thanks Giannis!
from grakel.
I see, will experiment with this. Thanks!
from grakel.
@giannisnik thanks again for your help. I'm experimenting with your suggestions, for the multiscale Laplacian kernel, I'm calling it with data as such after initialization: Ks = gk.fit_transform([[weighted_adj, {1: vec1, 2: vec2, ...}], ... ])
, where vec1
, vec2
are node attribute vectors.
I noticed that the classification accuracy doesn't change much at all after replacing the weighted adjacency matrix with some random matrix (like the identify), am I feeding the data in correctly? I take it that the multiscale Laplacian kernel doesn't discretize the weighted adjacency matrix and just treat nonzero entries as 1's based on what you said earlier.
Thanks!
from grakel.
Thanks for the example @giannisnik!
from grakel.
Related Issues (20)
- Print <class 'grakel.graph.Graph'>
- The Weisfeiler-Lehman Edge Kernel
- fetch_dataset not working HOT 1
- Graph hopper kernel unable to transfom the data
- Edge labels not working for custom dataset HOT 1
- Fingerprint dataset not accessible using fetch_dataset function HOT 13
- NaN error when using Random walk kernel on certain datasets HOT 3
- I don't know how to use my own data for input HOT 5
- MultiDiGraph: graph_from_networkx() ValueError HOT 1
- modified random walk kernel giving all 1 scores for normalize=true HOT 1
- Difference between ShortestPath and ShortestPathAttr HOT 2
- Generating graphs from dense/numpy matrices HOT 1
- Graph.clone() or Graph.copy() HOT 3
- Explicit graph kernels HOT 2
- Graphlet kernel HOT 1
- NeighborhoodSubgraphPairwiseDistance kernel returns diagonal elements less than 1 HOT 5
- One vs many comparison for WeisfeilerLehman; `.transform` gives `TypeError: each element of X must have either a graph with...` HOT 1
- RWK matrix returns by fit_transform by RandomWalkLabeled() contains only 1.0 HOT 7
- Error when using EdgeHistogram HOT 1
- Can't install on Windows 10, keep getting same error HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from grakel.