distmile's Introduction

About

DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding

Yuntian He, Saket Gurukar, Pouya Kousha, Hari Subramoni, Dhabaleswar K. Panda, Srinivasan Parthasarathy

Published in HiPC 2021

Abstract: DistMILE is a Distributed MultI-Level Embedding framework, which leverages a novel shared-memory parallel algorithm for graph coarsening and a distributed training paradigm for embedding refinement. With the advantage of high-performance computing techniques, DistMILE can smoothly scale different base embedding methods over large networks.

Citation information and link to be added.

Required packages

  • Horovod
  • TensorFlow
  • MVAPICH2-GDR
  • mpi4py
  • numpy
  • scipy
  • scikit-learn
  • networkx
  • gensim (For DeepWalk)
  • theano (For NetMF)
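
Before launching, it can help to check which of the Python packages above are importable in the current environment. The following is a minimal sketch and not part of DistMILE: the import names (for example `sklearn` for scikit-learn) are the usual ones for these packages, and the helper name `missing_packages` is hypothetical. MVAPICH2-GDR is left out because it is an MPI library rather than a Python module.

```python
import importlib.util

# Usual import names for the Python packages listed above (an assumption;
# MVAPICH2-GDR is an MPI library, not a Python module, so it is not checked).
REQUIRED_MODULES = {
    "Horovod": "horovod",
    "TensorFlow": "tensorflow",
    "mpi4py": "mpi4py",
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "networkx": "networkx",
    "gensim": "gensim",   # only needed for DeepWalk
    "theano": "theano",   # only needed for NetMF
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the package names whose top-level module cannot be located."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only looks for each top-level module without importing it, so it does not verify versions or that Horovod was built against the intended MPI library.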

How to Run

Run the main program with the following command:

mpirun_rsh -np ${NUM_GPU} ${GPUS} ${MPI_ENV} python main.py --data ${DATA} --basic-embed ${EMBED} --batch-size ${BATCH_SIZE} --coarsen-m ${COARSEN_M} --coarse-threshold ${COARSE_THRESHOLD} --workers ${NUM_THREADS} --num-threads ${NUM_THREADS} --coarse-parallel

Arguments and variables:

  • -np ${NUM_GPU}: Number of MPI processes to launch, i.e. the number of machines (GPUs) to use.
  • ${GPUS}: Hostnames of the machines.
  • ${MPI_ENV}: Environment variables for MPI. Use "MV2_USE_CUDA=1 MV2_SUPPORT_DL=1 MV2_ENABLE_AFFINITY=0 MV2_HOMOGENEOUS_CLUSTER=1".
  • --data: Dataset, located in dataset/${DATA}.
  • --basic-embed: Base embedding method, located in base_embed_methods.
  • --batch-size: Batch size for distributed learning. Default is 100000.
  • --coarsen-m: Coarsening depth (number of coarsening levels).
  • --coarse-threshold: Threshold for coarsening a graph in parallel. Denoted as $n_{c}$ in the paper. Default is 10000.
  • --workers: Number of threads for base embedding.
  • --num-threads: Number of threads for coarsening and refinement. All threads are used by default.
  • --coarse-parallel: Leverage the parallel version of graph coarsening.
  • --large: Send only the necessary input to the TensorFlow model. Disabled by default.
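
To make the mapping between the variables and the command line concrete, the sketch below assembles the same `mpirun_rsh` invocation in Python. It is an illustration under stated assumptions, not a script shipped with DistMILE: the helper name `build_command`, the host names, the dataset name `my_graph`, the embedding name `deepwalk`, and the depth and thread values are all placeholders, and it assumes one process per listed host. Only the flags and defaults documented above are used.

```python
import shlex

# MPI environment settings recommended in the argument list above.
MPI_ENV = (
    "MV2_USE_CUDA=1 MV2_SUPPORT_DL=1 "
    "MV2_ENABLE_AFFINITY=0 MV2_HOMOGENEOUS_CLUSTER=1"
)


def build_command(hosts, data, embed, coarsen_m, num_threads,
                  batch_size=100000, coarse_threshold=10000):
    """Assemble the mpirun_rsh invocation as a list of arguments.

    Assumes one MPI process per host, so -np is set to len(hosts).
    """
    cmd = ["mpirun_rsh", "-np", str(len(hosts)), *hosts, *MPI_ENV.split()]
    cmd += [
        "python", "main.py",
        "--data", data,
        "--basic-embed", embed,
        "--batch-size", str(batch_size),
        "--coarsen-m", str(coarsen_m),
        "--coarse-threshold", str(coarse_threshold),
        "--workers", str(num_threads),
        "--num-threads", str(num_threads),
        "--coarse-parallel",
    ]
    return cmd


if __name__ == "__main__":
    # All values below are placeholders chosen for illustration only.
    command = build_command(
        hosts=["node1", "node2"],
        data="my_graph",
        embed="deepwalk",
        coarsen_m=2,
        num_threads=8,
    )
    print(shlex.join(command))
```

Printing the command instead of executing it makes it easy to inspect before pasting it into a job script; passing the list to `subprocess.run` would launch it directly.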
