Giter VIP home page Giter VIP logo

subway's Introduction

Subway

Subway is an out-of-GPU-memory graph processing framework.

Subway provides a highly cost-effective solution to extracting a subgraph that only consists of the edges of active vertices. This allows it to transfer only the active parts of the graph from CPU to GPU, thus dramatically reduces the volume of data transfer. The benefits from the data transfer reduction outweigh the costs of subgraph generation in (almost) all iterations of graph processing, bringing in substantial overall performance improvements. Moreover, it supports asynchronous processing between the loaded subgraph in GPU and the rest of the graph in host memory, which tends to decrease the number of global iterations, thus can further reduce the data transfer.

Compilation

To compile Subway, just run make in the root directory. The only requrements are g++ and CUDA toolkit.

Input graph formats

Subway accepts edge-list (.el) and weighted edge-list (.wel) graph formats, as well as the binary serialized pre-built CSR graph representation (.bcsr and .bwcsr). It is highly recommended to convert edge-list format graph files to the binary format (using tools/converter). Reading binary formats is faster and more space efficient.

Subway is sensitive to graph file extension. A weighted edge-list graph file has to end with .wel. The followings are two graph file examples.

Graph.el ("SOURCE DESTINATION" for each edge in each line):

0 1
0 3
2 3
1 2

Graph.wel ("SOURCE DESTINATION WEIGHT" for each edge in each line):

0 1 26
0 3 33
2 3 40
1 2 10

To convert these graph files to the binary format, run the following commands in the root folder:

tools/converter path_to_Graph.el
tools/converter path_to_Graph.wel

The first command converts Graph.el to the binary CSR format and generates a binary graph file with .bcsr extension under the same directory as the original file. The second command converts Graph.wel to a weighted binary graph file with .bwcsr extension.

Running applications in Subway

The applications take a graph as input as well as some optional arguments. For example:

$ ./sssp-async --input path-to-input-graph
$ ./sssp-async --input path-to-input-graph --source 10

For applications that run on weighted graphs, like SSSP, the input must be weighted (.bwcsr or .wel) and for applications that run on unweighted graphs, like BFS, the input must be unweighted (.bcsr or .el).

Publications:

[EUROSYS'20] Amir Hossein Nodehi Sabet, Zhijia Zhao, and Rajiv Gupta. Subway: minimizing data transfer during out-of-GPU-memory graph processing. In Proceedings of the Fifteenth European Conference on Computer Systems.

[ASPLOS'18] Amir Hossein Nodehi Sabet, Junqiao Qiu, and Zhijia Zhao. Tigr: Transforming Irregular Graphs for GPU-Friendly Graph Processing. In Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems.

subway's People

Contributors

amirnodehi avatar automatalab avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

subway's Issues

When can we see the project source code?

Dear author of Subway,
Hi, I am reading your article these days and want to see the source code to deepen understanding. In your readme, you promised the source code will be posted around mid-May, but May is coming to an end. I would appreciate it if you can reply me in this issue. Thanks!

error

After make, run the instruction. / bfs-sync -- input. / edge_ 875713.el

The tips are as follows:
Reading the input graph from the following file:

    ./edge_875713.el
    Done reading.
    Number of nodes = 875713
    Number of edges = 9238952
    Graph Reading finished in 1.80066 (s).
    GPUassert: out of memory subgraph.cu 40

Why is it out of memory

little bug

After running with a simple graph,a segmentation fault occurred
check the graph.cu I found that in line 156

                        uint *outDegreeCounter  = new uint[num_nodes];
			uint location;  
			for(uint i=0; i<num_edges; i++)
			{
				location = nodePointer[edges[i].source] + outDegreeCounter[edges[i].source];
				edgeList[location].end = edges[i].end;
				//if(isWeighted)
				//	edgeList[location].w8 = edges[i].w8;
				outDegreeCounter[edges[i].source]++;  
			}

The outDegreeCounter array should be initialized

Off-By-One Error in Subgraph Generation

On line 75 in subgraph_generator.cu, ceil() is useless with integer division.
Should be unsigned int chunkSize = (unsigned int)ceil((double)numActiveNodes / numThreads);

Results in not copying some data to the edge array, sending the old values to the kernel which results in undefined behavior; usually erroneous values in the last .01% of the nodes (but can propagate through the rest of the graph).

Cannot processes graphs with |E| > 2**32

Hi,

Throughout your entire code, you use uint to represent your pointers to the edges.
I was attempting to preprocess an RMAT dataset similar to how you have in the paper and the preprocessing causes the graph to have a mere 5GB of memory.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.