
tc's Introduction

TC

Preprocess from SNAP data

Data Formats

SNAP

A SNAP dataset (http://snap.stanford.edu/) is a text file in which each line is an edge, given as a source and a destination vertex ID, such as
11 25

CSR

CSR data generally consists of two arrays: the begin position array, which stores the offset of each source vertex's neighbor list, and the adjacency list, which stores the destination vertices of all edges.

adjacent.bin
begin.bin
head.bin
edge
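To make the layout concrete, here is a minimal in-memory sketch of how the two arrays are read. The example graph, the function names, and the convention that begin has one more entry than there are vertices are assumptions for illustration; the on-disk integer width of the .bin files is not specified here and is not modeled.

```python
# Hypothetical CSR for the directed edges 0->1, 0->2, 1->2, 3->0.
begin = [0, 2, 3, 3, 4]   # begin[v]..begin[v + 1] delimits v's neighbor list
adjacent = [1, 2, 2, 0]   # destination vertices of all edges, grouped by source


def neighbors(v, begin, adjacent):
    """Return the destination vertices of all edges leaving vertex v."""
    return adjacent[begin[v]:begin[v + 1]]


def degree(v, begin):
    """Return the number of edges leaving vertex v."""
    return begin[v + 1] - begin[v]
```

Vertex 2 has no outgoing edges in this example, so its slice is empty: begin[2] and begin[3] are equal.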

converter

The code is in the folder below.

TC/graph_converter/undirected_csr/
./tuple_to_undirected_csr.bin input

It automatically generates the CSR files adjacent.bin, head.bin and begin.bin in the current directory. Each line in the SNAP file (e.g., 0 1) generates two edges, (0,1) and (1,0). The converted files are binary files. The begin position file begin.bin stores the offset of each neighbor list in both head.bin and adjacent.bin. The neighbor lists are not sorted.
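The conversion described above can be sketched in Python as follows. This is not the repository's converter: the function name is hypothetical, the arrays stay in memory instead of being written to .bin files, and skipping lines that start with # (as SNAP files commonly use for comments) is an assumption.

```python
def tuples_to_undirected_csr(lines):
    """Build begin/head/adjacent arrays, storing every edge in both directions."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):   # assumed comment convention
            continue
        u, v = map(int, line.split()[:2])
        edges.append((u, v))                   # original direction
        edges.append((v, u))                   # reverse direction
    num_vertices = 1 + max(max(u, v) for u, v in edges) if edges else 0

    # Count out-degrees, then prefix-sum them into offsets.
    begin = [0] * (num_vertices + 1)
    for u, _ in edges:
        begin[u + 1] += 1
    for i in range(num_vertices):
        begin[i + 1] += begin[i]

    # Scatter edges into place; neighbor lists keep input order (unsorted).
    head = [0] * len(edges)
    adjacent = [0] * len(edges)
    cursor = begin[:-1].copy()
    for u, v in edges:
        pos = cursor[u]
        head[pos] = u
        adjacent[pos] = v
        cursor[u] += 1
    return begin, head, adjacent
```

In this sketch head[i] and adjacent[i] together give the source and destination of the i-th stored edge, which is why one offset array can index both.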

cleaner

The code is in the folder below.

TC/graph_cleaner/
./graph_cleane.bin input_path

The cleaner sorts all neighbor lists and removes self-loops and duplicate edges. After this step, the graph is an undirected graph in CSR format (both directions of each edge are stored in the CSR). The output files are written to the current folder, so copying the executable to the target folder is suggested.
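As an illustration of the cleaning step, the sketch below applies the three operations to in-memory CSR arrays. The function name and the use of a Python set are assumptions made for brevity; the actual tool works on the binary files and may be implemented differently.

```python
def clean_csr(begin, adjacent):
    """Sort each neighbor list and drop self-loops and duplicate edges."""
    new_begin = [0]
    new_adjacent = []
    for v in range(len(begin) - 1):
        nbrs = set(adjacent[begin[v]:begin[v + 1]])  # removes duplicates
        nbrs.discard(v)                              # removes the self-loop, if any
        new_adjacent.extend(sorted(nbrs))            # sorted neighbor list
        new_begin.append(len(new_adjacent))          # offsets shift as edges vanish
    return new_begin, new_adjacent
```

Because edges are removed, the offsets must be rebuilt together with the adjacency list.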

rank_by_degree

./rank_by_degree.bin input_path

The rank_by_degree orientation tool removes half of all edges: it keeps only the edges from the lower-degree vertex to the higher-degree vertex. After this step, the graph is still in CSR format, but each undirected edge is stored only once, in the direction from the lower-degree to the higher-degree vertex. The output files are written to the current folder, so copying the executable to the target folder is suggested.
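A minimal sketch of degree-based orientation on a cleaned, undirected CSR is shown below. The function name is hypothetical, and breaking ties between equal-degree vertices by vertex ID is an assumption; the text above does not say how the tool handles ties.

```python
def orient_by_degree(begin, adjacent):
    """Keep edge (u, v) only if u ranks below v by (degree, vertex ID)."""
    n = len(begin) - 1
    deg = [begin[v + 1] - begin[v] for v in range(n)]
    new_begin = [0]
    new_adjacent = []
    for u in range(n):
        for v in adjacent[begin[u]:begin[u + 1]]:
            # Tie-break on vertex ID (an assumption) so exactly one
            # direction of every undirected edge survives.
            if (deg[u], u) < (deg[v], v):
                new_adjacent.append(v)
        new_begin.append(len(new_adjacent))
    return new_begin, new_adjacent
```

With a strict total order on vertices, exactly one of (u, v) and (v, u) passes the test, so the edge count is halved.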

Partition

The code is in the folder below.

TC/TC-EXT/partitioner-x/
./partition-x input_path

The partitioner tool partitions the graph in a 2-D fashion. The rule is described in the paper. It only keeps the edges from the lower-degree vertex to the higher-degree vertex. After this step, the graph is an undirected graph in CSR format. The output files are written to the current folder, so copying the executable to the target folder is suggested.
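The exact partitioning rule is not given here, so the sketch below only illustrates one common 2-D scheme and should not be read as the rule partition-x implements. It assumes the vertex ID range is cut into equal contiguous chunks and that edge (u, v) is assigned to the block indexed by the chunk of u and the chunk of v; the function name and the parts parameter are hypothetical.

```python
def partition_2d(begin, adjacent, parts):
    """Assign each edge (u, v) to block (chunk of u, chunk of v)."""
    n = len(begin) - 1
    chunk = max(1, -(-n // parts))  # ceil(n / parts), at least 1
    blocks = {(i, j): [] for i in range(parts) for j in range(parts)}
    for u in range(n):
        for v in adjacent[begin[u]:begin[u + 1]]:
            blocks[(u // chunk, v // chunk)].append((u, v))
    return blocks
```

Under this assumed scheme, a graph split into p vertex chunks yields p × p edge blocks, some of which may be empty.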

Triangle counting

GPU in-memory

The in-memory GPU triangle counting code is in TC/TC-GPU/work-steal/. Run it with the command ./tc <input_path>.

CPU in-memory

The in-memory CPU triangle counting code is in TC/TC-CPU/tc-ne-cpu/. Run it with the command ./tc <input_path>.
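For reference, triangle counting on an oriented CSR graph with sorted neighbor lists can be sketched as a sorted-list intersection per edge. This is a generic sequential illustration with a hypothetical function name, not the algorithm used by the GPU or CPU code in this repository.

```python
def count_triangles(begin, adjacent):
    """Count triangles by intersecting sorted neighbor lists for every edge."""
    n = len(begin) - 1
    total = 0
    for u in range(n):
        for k in range(begin[u], begin[u + 1]):
            v = adjacent[k]
            # Merge-style intersection of the neighbor lists of u and v.
            i, i_end = begin[u], begin[u + 1]
            j, j_end = begin[v], begin[v + 1]
            while i < i_end and j < j_end:
                a, b = adjacent[i], adjacent[j]
                if a == b:
                    total += 1
                    i += 1
                    j += 1
                elif a < b:
                    i += 1
                else:
                    j += 1
    return total
```

Because each undirected edge is stored in only one direction, every triangle is found once, from its lowest-ranked vertex, so no division by a multiplicity factor is needed.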

GPU on 2d partitioned data

CPU on 2d partitioned data

Graph Challenge Example Dataset and Toolkit

The example dataset is in TC/example_dataset/amazon0302/. The _adj.mmio file is the original data format downloaded from https://graphchallenge.mit.edu/data-sets. The converter tool source code is in TC/gConv/. Copy the executable to the dataset path, then modify line 8 of the bash script converter.sh: set the number to (#vertex+1). The vertex count can be found with less <input>_adj.mmio. Run the bash script with the command ./converter.sh <input>_adj.mmio, replacing <input> with the dataset name (for example, amazon0302).
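Instead of paging through the file, the vertex count can also be read programmatically. The sketch below assumes the file follows the standard Matrix Market coordinate layout, where lines starting with % are header or comment lines and the first remaining line holds the row count, column count and number of entries. The function name and the sample lines are hypothetical.

```python
def mmio_vertex_count(lines):
    """Return the vertex count from the size line of a Matrix Market file."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("%"):   # banner and comment lines
            continue
        rows, cols, _entries = map(int, line.split()[:3])
        return max(rows, cols)
    raise ValueError("no size line found")


# Made-up sample content, not taken from the real dataset.
sample = [
    "%%MatrixMarket matrix coordinate pattern symmetric",
    "% a comment line",
    "5 5 4",
    "2 1",
]
value_for_line_8 = mmio_vertex_count(sample) + 1   # the (#vertex+1) setting
```

To use it on a real file, pass an open file object, for example mmio_vertex_count(open(path)).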

tc's People

Contributors

asherliu, huyang1988

tc's Issues

About the Segmentation fault problem

Hi @huyang1988 ,

I'm glad to see this code for Triangle Counting on GPU.
I compiled the code on Ubuntu 16.04 with CUDA 10.4 and gcc 7.5.
But a Segmentation fault (core dumped) error occurred during execution. I think the problem is caused by the data format. Would you please give some details on the data format you used in the project?
