struc2vec's Introduction

struc2vec

This repository provides a reference implementation of struc2vec as described in the paper:

struc2vec: Learning Node Representations from Structural Identity.
Leonardo F. R. Ribeiro, Pedro H. P. Saverese, Daniel R. Figueiredo.
Knowledge Discovery and Data Mining, SigKDD, 2017.

The struc2vec algorithm learns continuous representations for nodes in any graph. struc2vec captures structural equivalence between nodes.

Before running struc2vec, install the following packages:
pip install futures
pip install fastdtw
pip install gensim

Update

Python 3 version: https://github.com/sebkaz/struc2vec/tree/master

Basic Usage

Example

To run struc2vec on the mirrored Zachary's karate club network, execute the following command from the project home directory:
python src/main.py --input graph/karate-mirrored.edgelist --output emb/karate-mirrored.emb

Options

To activate optimization 1, use the following option: --OPT1 true
To activate optimization 2: --OPT2 true
To activate optimization 3: --OPT3 true

To run struc2vec on the barbell network with all optimizations enabled, execute the following command from the project home directory:
python src/main.py --input graph/barbell.edgelist --output emb/barbell.emb --num-walks 20 --walk-length 80 --window-size 5 --dimensions 2 --OPT1 True --OPT2 True --OPT3 True --until-layer 6

To list the other options available with struc2vec, run:
python src/main.py --help

Input

The supported input format is an edgelist:

node1_id_int node2_id_int
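
As an illustration of this format, here is a minimal sketch of loading an edgelist into an adjacency mapping. The function name `read_edgelist` and the sample edges are assumptions made for this example; this is not the loader used in the repository.

```python
import io


def read_edgelist(lines, directed=False):
    """Build {node: set(neighbors)} from 'node1_id_int node2_id_int' lines.

    Hypothetical helper for illustration only.
    """
    adjacency = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        adjacency.setdefault(u, set()).add(v)
        # Make sure v is present as a key even if it has no outgoing edges.
        adjacency.setdefault(v, set())
        if not directed:
            adjacency[v].add(u)
    return adjacency


# Made-up three-node triangle, one edge per line.
sample = io.StringIO("0 1\n1 2\n2 0\n")
graph = read_edgelist(sample)
```

Each line contributes one edge; for an undirected graph the edge is stored in both directions.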

Output

The output file has n+1 lines for a graph with n vertices. The first line has the following format:

num_of_nodes dim_of_representation

The next n lines are as follows:

node_id dim1 dim2 ... dimd

where dim1, ... , dimd is the d-dimensional representation learned by struc2vec.
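
The output layout described above can be read back with a few lines of Python. The following is a minimal sketch, assuming whitespace-separated fields; the helper name `read_embeddings` and the sample values are invented for illustration. Node IDs are kept as strings here so that no assumption is made about how they were written.

```python
import io


def read_embeddings(lines):
    """Parse the 'n d' header followed by n lines of 'node_id dim1 ... dimd'.

    Hypothetical helper for illustration only.
    """
    lines = iter(lines)
    header = next(lines).split()
    num_nodes, dim = int(header[0]), int(header[1])

    embeddings = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore trailing blank lines
        vector = [float(x) for x in parts[1:]]
        if len(vector) != dim:
            raise ValueError("expected %d values, got %d" % (dim, len(vector)))
        embeddings[parts[0]] = vector

    if len(embeddings) != num_nodes:
        raise ValueError("header declares %d nodes, found %d"
                         % (num_nodes, len(embeddings)))
    return embeddings


# Made-up file contents: 2 nodes, 2 dimensions.
sample = io.StringIO("2 2\n0 0.5 -1.0\n1 0.25 2.0\n")
emb = read_embeddings(sample)
```

The header check catches truncated files, which is useful when a long run was interrupted.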

Miscellaneous

Please send any questions you might have about the code and/or the algorithm to [email protected].

Note: This is only a reference implementation of the framework struc2vec.

struc2vec's People

Contributors

leoribeiro, lolemacs, lookfwd

struc2vec's Issues

A problem with the Barbell graph

Thank you for your code!
I used the file struc2vec/emb/barbell.emb provided with your code to plot the latent representations, but I cannot reproduce the result shown in Figure 2(e) of your paper.
I look forward to your answer, and thanks again!

Python3

The current code targets Python 2. Is a Python 3 version of the codebase available?

RuntimeWarning: invalid value encountered in double_scalars

When I run the code on the Cora dataset, the following warning appears:
/home/songzenghui/zee/struc2vec/src/algorithms_distances.py:541: RuntimeWarning: invalid value encountered in double_scalars
  e_list = [x / sum_w for x in e_list]
The dataset has 2708 nodes and 5429 edges, and I ran:
python src/main.py --input graph/cora.edgelist --output emb/cora.embeddings --OPT1 True --OPT2 Ture --OPT3 True
Is sum_w so small that the division becomes invalid?
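
For context, NumPy reports "invalid value encountered" when a floating-point operation yields NaN, for example 0/0 or inf/inf, so a zero or non-finite sum_w is one plausible cause. The sketch below shows a guarded normalization in plain Python. The function `normalize_weights` and its uniform fallback are assumptions made for illustration, not the repository's code or an official fix.

```python
import math


def normalize_weights(weights):
    """Divide each weight by the total, guarding against a degenerate total.

    Hypothetical helper: falls back to a uniform distribution when the total
    is zero, infinite, or NaN. Whether that fallback is appropriate for
    struc2vec is an assumption of this sketch.
    """
    if not weights:
        return []
    total = sum(weights)
    if total == 0 or not math.isfinite(total):
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


regular = normalize_weights([1.0, 3.0])
degenerate = normalize_weights([0.0, 0.0])
```

Logging the nodes for which the fallback triggers would help confirm whether underflow in the weights is really the cause.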

struc2vec on weighted graph

The arguments args.weighted and args.unweighted are never passed in any module, and the default argument is unweighted = True. Do the weighted/unweighted arguments have any effect?

A license

What license does this reference implementation fall under? MIT?

issue attribute

Traceback (most recent call last):
  File "src/main.py", line 128, in <module>
    main(args)
  File "src/main.py", line 121, in main
    G = exec_struc2vec(args)
  File "src/main.py", line 97, in exec_struc2vec
    G = struc2vec.Graph(G, args.directed, args.workers, untilLayer = until_layer)
  File "/Users/anton/Downloads/struc2vec-master/src/struc2vec.py", line 20, in __init__
    self.G = g.gToDict()
  File "/Users/anton/Downloads/struc2vec-master/src/graph.py", line 127, in gToDict
    for k,v in self.iteritems():
AttributeError: 'Graph' object has no attribute 'iteritems'
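
This error is typical of running Python 2 code under Python 3: dictionaries lost `iteritems()` in Python 3, where `items()` is used instead. The sketch below assumes the repository's Graph class behaves like a dict; the `Graph` class shown is a simplified stand-in written for this example, not the actual source.

```python
class Graph(dict):
    """Hypothetical stand-in for a dict-like graph class."""

    def gToDict(self):
        # Python 2 code would call self.iteritems() here.
        # In Python 3 that method no longer exists; items() is the replacement.
        result = {}
        for k, v in self.items():
            result[k] = v
        return result


g = Graph({0: [1], 1: [0]})
plain = g.gToDict()
```

Other Python 2 idioms in the same code base may need similar changes when porting.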

Problem with disconnected nodes

I work with graphs that contain some disconnected nodes. I believe that when I create the edge list, which contains only the connected part of the network, the algorithm outputs representations for only that connected part. Is there any way to include the disconnected nodes, or should I remove them from the network completely?

Ask for help

Karate-mirrored.edgelist: what do the two columns represent? I hope to get your answer, thanks.

File

Hello, I cannot find the emb.karate.emb data. Did you forget to include it?

Node attributes/labels

Is struc2vec able to handle cases with node attributes/labels?
Or is it reasonable to combine node attributes/labels with struc2vec?

A problem with scalability

I tried to train struc2vec on a graph with 1 million nodes using 24 threads, but no embeddings had been generated after three days. Is there a solution?

Plot the embedded nodes

Hi, and thank you for the open-source code. I have a question: I used t-SNE to plot the 128-dimensional embeddings in 2D, but the distribution does not look like the karate dataset figure in the paper. Did you use t-SNE for the visualization? Thank you!

IOError: bad message length

I am feeding in a new undirected graph dataset (V=18059, E=286535). It fails with:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 268, in _feed
    send(obj)
IOError: bad message length
