LDBC benchmark data sets

The LDBC benchmark data sets are stored under SURF's CWI repositories.

💡 The LDBC SNB Business Intelligence (BI) workload's data sets are stored in Cloudflare R2. See the links to the BI data sets.

💡 The LDBC SNB Interactive v2 workload's data sets are stored in Cloudflare R2. See the links to the Interactive v2 data sets and update streams.

Usage

The data sets are stored on tape, so you may have to stage them before they can be downloaded. To do so, visit the repository of the data set and click "Request" for offline files. Staging a 20 GB file takes approximately 3-5 minutes, while staging a 200 GB file takes approximately 10-15 minutes.
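If you request staging by hand, a small polling loop can retry the download until the file comes online. The following is a minimal sketch, not part of this repository: the `retry_until_ok` helper is hypothetical, and it assumes that a download attempt on a file that is still offline exits with a non-zero status, which this README does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command until it succeeds, up to a maximum
# number of attempts, sleeping between attempts.
# Usage: retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while true; do
        # ASSUMPTION: the command fails while the file is still being staged
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}
```

For example, `retry_until_ok 20 60 curl --silent --fail --output data.tar.zst set_url_here` would try once a minute for up to 20 attempts, which covers the staging times quoted above.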

To download and decompress a data set in one step, pipe curl into tar, which uses zstd (unzstd) for decompression:

curl --silent --fail set_url_here | tar -xv --use-compress-program=unzstd
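For large archives, it may be more convenient to save the file to disk first so that an interrupted download can be resumed before unpacking. The following is a minimal sketch under that assumption; the `fetch_then_extract` helper is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: download an archive to disk, then unpack it.
# Usage: fetch_then_extract URL [DEST_DIR]
fetch_then_extract() {
    local url="$1" dest="${2:-.}"
    local archive
    archive="$dest/$(basename "$url")"
    mkdir -p "$dest" || return 1
    # --continue-at - asks curl to resume a partial file left by an earlier attempt
    curl --fail --continue-at - --output "$archive" "$url" || return 1
    # Same decompressor as the one-liner above, reading from the saved file
    tar -x --use-compress-program=unzstd -f "$archive" -C "$dest"
}
```

The trade-off is disk space: the compressed archive and the extracted files exist side by side until the archive is deleted.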

We provide the download-data-set.sh script, which attempts to download the data set and stages it to disk if necessary. Replace data_set_url with one of the URLs linked below in this README (right-click the link and select Copy Link Address).

./download-data-set.sh data_set_url

Example:

./download-data-set.sh https://repository.surfsara.nl/datasets/cwi/snb/files/social_network-csv_basic-longdateformatter/social_network-csv_basic-longdateformatter-sf0.1.tar.zst
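To fetch several scale factors in one go, the script can be called in a loop. The sketch below is illustrative only: `snb_url` and `download_scale_factors` are hypothetical helpers, and the URL pattern is inferred from the single example above, so it may not hold for every data set. Copying the actual links remains the reliable route.

```shell
#!/usr/bin/env bash
# Base URL taken from the example above.
SNB_BASE_URL="https://repository.surfsara.nl/datasets/cwi/snb/files"

# Hypothetical helper: build a data set URL from a variant name and a scale factor.
# ASSUMPTION: every variant follows the naming pattern of the example URL.
snb_url() {
    local variant="$1" sf="$2"
    echo "${SNB_BASE_URL}/social_network-${variant}/social_network-${variant}-sf${sf}.tar.zst"
}

# Hypothetical helper: download several scale factors of one variant in turn,
# stopping at the first failure.
# Usage: download_scale_factors VARIANT SF [SF...]
download_scale_factors() {
    local variant="$1"
    shift
    local sf
    for sf in "$@"; do
        ./download-data-set.sh "$(snb_url "$variant" "$sf")" || return 1
    done
}
```

For example, `download_scale_factors csv_basic-longdateformatter 0.1 0.3 1` would request three archives one after another.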

LDBC Graphalytics

📥 Repository

Graph and validation data sets

| data set | number of vertices | number of edges |
| --- | --- | --- |
| cit-Patents.tar.zst | 3774768 | 16518947 |
| com-friendster.tar.zst | 65608366 | 1806067135 |
| datagen-7_5-fb.tar.zst | 633432 | 34185747 |
| datagen-7_6-fb.tar.zst | 754147 | 42162988 |
| datagen-7_7-zf.tar.zst | 13180508 | 32791267 |
| datagen-7_8-zf.tar.zst | 16521886 | 41025255 |
| datagen-7_9-fb.tar.zst | 1387587 | 85670523 |
| datagen-8_0-fb.tar.zst | 1706561 | 107507376 |
| datagen-8_1-fb.tar.zst | 2072117 | 134267822 |
| datagen-8_2-zf.tar.zst | 43734497 | 106440188 |
| datagen-8_3-zf.tar.zst | 53525014 | 130579909 |
| datagen-8_4-fb.tar.zst | 3809084 | 269479177 |
| datagen-8_5-fb.tar.zst | 4599739 | 332026902 |
| datagen-8_6-fb.tar.zst | 5667674 | 421988619 |
| datagen-8_7-zf.tar.zst | 145050709 | 340157363 |
| datagen-8_8-zf.tar.zst | 168308893 | 413354288 |
| datagen-8_9-fb.tar.zst | 10572901 | 848681908 |
| datagen-9_0-fb.tar.zst | 12857671 | 1049527225 |
| datagen-9_1-fb.tar.zst | 16087483 | 1342158397 |
| datagen-9_2-zf.tar.zst | 434943376 | 1042340732 |
| datagen-9_3-zf.tar.zst | 555270053 | 1309998551 |
| datagen-9_4-fb.tar.zst | 29310565 | 2588948669 |
| datagen-sf10k-fb.tar.zst | 33484375 | 2912009743 |
| datagen-sf3k-fb.tar.zst | 100218750 | 9404822538 |
| dota-league.tar.zst | 61170 | 50870313 |
| example-directed.tar.zst | 10 | 17 |
| example-undirected.tar.zst | 9 | 12 |
| graph500-22.tar.zst | 2396657 | 64155735 |
| graph500-23.tar.zst | 4610222 | 129333677 |
| graph500-24.tar.zst | 8870942 | 260379520 |
| graph500-25.tar.zst | 17062472 | 523602831 |
| graph500-26.tar.zst | 32804978 | 1051922853 |
| graph500-27.tar.zst | 63081040 | 2111642032 |
| graph500-28.tar.zst | 121242388 | 4236163958 |
| graph500-29.tar.zst | 232999630 | 8493569115 |
| graph500-30.tar.zst | 447797986 | 17022117362 |
| kgs.tar.zst | 832247 | 17891698 |
| twitter_mpi.tar.zst | 52579678 | 1963263508 |
| wiki-Talk.tar.zst | 2394385 | 5021410 |
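The vertex and edge counts listed here can serve as a quick sanity check after unpacking. The sketch below is illustrative only: `check_graph_counts` is a hypothetical helper, and it assumes that an archive unpacks to a vertex file and an edge file with one record per line, which this README does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare the line counts of a vertex file and an edge
# file with the expected numbers.
# ASSUMPTION: one vertex or edge per line, each line ending in a newline.
# Usage: check_graph_counts VERTEX_FILE EDGE_FILE EXPECTED_VERTICES EXPECTED_EDGES
check_graph_counts() {
    local vfile="$1" efile="$2" expected_v="$3" expected_e="$4"
    local actual_v actual_e
    actual_v=$(wc -l < "$vfile") || return 1
    actual_e=$(wc -l < "$efile") || return 1
    # wc may pad its output with spaces; arithmetic expansion normalizes it
    actual_v=$((actual_v))
    actual_e=$((actual_e))
    if [ "$actual_v" -eq "$expected_v" ] && [ "$actual_e" -eq "$expected_e" ]; then
        echo "OK: ${actual_v} vertices, ${actual_e} edges"
    else
        echo "MISMATCH: got ${actual_v} vertices and ${actual_e} edges, expected ${expected_v} and ${expected_e}" >&2
        return 1
    fi
}
```

With hypothetical file names, a call for the smallest directed example would look like `check_graph_counts example-directed.v example-directed.e 10 17`.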

Graphs as sparse matrices in Matrix Market format

| data set | number of vertices | number of edges |
| --- | --- | --- |
| matrix-market/cit-Patents.tar.zst | 3774768 | 16518947 |
| matrix-market/com-friendster.tar.zst | 65608366 | 1806067135 |
| matrix-market/datagen-7_5-fb-bool.tar.zst | 633432 | 34185747 |
| matrix-market/datagen-7_5-fb-fp64.tar.zst | 633432 | 34185747 |
| matrix-market/datagen-7_6-fb-bool.tar.zst | 754147 | 42162988 |
| matrix-market/datagen-7_6-fb-fp64.tar.zst | 754147 | 42162988 |
| matrix-market/datagen-7_7-zf-bool.tar.zst | 13180508 | 32791267 |
| matrix-market/datagen-7_7-zf-fp64.tar.zst | 13180508 | 32791267 |
| matrix-market/datagen-7_8-zf-bool.tar.zst | 16521886 | 41025255 |
| matrix-market/datagen-7_8-zf-fp64.tar.zst | 16521886 | 41025255 |
| matrix-market/datagen-7_9-fb-bool.tar.zst | 1387587 | 85670523 |
| matrix-market/datagen-7_9-fb-fp64.tar.zst | 1387587 | 85670523 |
| matrix-market/datagen-8_0-fb-bool.tar.zst | 1706561 | 107507376 |
| matrix-market/datagen-8_0-fb-fp64.tar.zst | 1706561 | 107507376 |
| matrix-market/datagen-8_1-fb-bool.tar.zst | 2072117 | 134267822 |
| matrix-market/datagen-8_1-fb-fp64.tar.zst | 2072117 | 134267822 |
| matrix-market/datagen-8_2-zf-bool.tar.zst | 43734497 | 106440188 |
| matrix-market/datagen-8_2-zf-fp64.tar.zst | 43734497 | 106440188 |
| matrix-market/datagen-8_3-zf-bool.tar.zst | 53525014 | 130579909 |
| matrix-market/datagen-8_3-zf-fp64.tar.zst | 53525014 | 130579909 |
| matrix-market/datagen-8_4-fb-bool.tar.zst | 3809084 | 269479177 |
| matrix-market/datagen-8_4-fb-fp64.tar.zst | 3809084 | 269479177 |
| matrix-market/datagen-8_5-fb-bool.tar.zst | 4599739 | 332026902 |
| matrix-market/datagen-8_5-fb-fp64.tar.zst | 4599739 | 332026902 |
| matrix-market/datagen-8_6-fb-bool.tar.zst | 5667674 | 421988619 |
| matrix-market/datagen-8_6-fb-fp64.tar.zst | 5667674 | 421988619 |
| matrix-market/datagen-8_7-zf-bool.tar.zst | 145050709 | 340157363 |
| matrix-market/datagen-8_7-zf-fp64.tar.zst | 145050709 | 340157363 |
| matrix-market/datagen-8_8-zf-bool.tar.zst | 168308893 | 413354288 |
| matrix-market/datagen-8_8-zf-fp64.tar.zst | 168308893 | 413354288 |
| matrix-market/datagen-8_9-fb-bool.tar.zst | 10572901 | 848681908 |
| matrix-market/datagen-8_9-fb-fp64.tar.zst | 10572901 | 848681908 |
| matrix-market/datagen-9_0-fb-bool.tar.zst | 12857671 | 1049527225 |
| matrix-market/datagen-9_0-fb-fp64.tar.zst | 12857671 | 1049527225 |
| matrix-market/datagen-9_1-fb-bool.tar.zst | 16087483 | 1342158397 |
| matrix-market/datagen-9_1-fb-fp64.tar.zst | 16087483 | 1342158397 |
| matrix-market/datagen-9_2-zf-bool.tar.zst | 434943376 | 1042340732 |
| matrix-market/datagen-9_2-zf-fp64.tar.zst | 434943376 | 1042340732 |
| matrix-market/datagen-9_3-zf-bool.tar.zst | 555270053 | 1309998551 |
| matrix-market/datagen-9_3-zf-fp64.tar.zst | 555270053 | 1309998551 |
| matrix-market/datagen-9_4-fb-bool.tar.zst | 29310565 | 2588948669 |
| matrix-market/datagen-9_4-fb-fp64.tar.zst | 29310565 | 2588948669 |
| matrix-market/datagen-sf10k-fb-bool.tar.zst | 33484375 | 2912009743 |
| matrix-market/datagen-sf10k-fb-fp64.tar.zst | 33484375 | 2912009743 |
| matrix-market/datagen-sf3k-fb-bool.tar.zst | 100218750 | 9404822538 |
| matrix-market/datagen-sf3k-fb-fp64.tar.zst | 100218750 | 9404822538 |
| matrix-market/dota-league-bool.tar.zst | 61170 | 50870313 |
| matrix-market/dota-league-fp64.tar.zst | 61170 | 50870313 |
| matrix-market/example-directed-bool.tar.zst | 10 | 17 |
| matrix-market/example-directed-fp64.tar.zst | 10 | 17 |
| matrix-market/example-undirected-bool.tar.zst | 9 | 12 |
| matrix-market/example-undirected-fp64.tar.zst | 9 | 12 |
| matrix-market/graph500-22.tar.zst | 2396657 | 64155735 |
| matrix-market/graph500-23.tar.zst | 4610222 | 129333677 |
| matrix-market/graph500-24.tar.zst | 8870942 | 260379520 |
| matrix-market/graph500-25.tar.zst | 17062472 | 523602831 |
| matrix-market/graph500-26.tar.zst | 32804978 | 1051922853 |
| matrix-market/graph500-27.tar.zst | 63081040 | 2111642032 |
| matrix-market/graph500-28.tar.zst | 121242388 | 4236163958 |
| matrix-market/graph500-29.tar.zst | 232999630 | 8493569115 |
| matrix-market/graph500-30.tar.zst | 447797986 | 17022117362 |
| matrix-market/kgs-bool.tar.zst | 832247 | 17891698 |
| matrix-market/kgs-fp64.tar.zst | 832247 | 17891698 |
| matrix-market/twitter_mpi.tar.zst | 52579678 | 1963263508 |
| matrix-market/wiki-Talk.tar.zst | 2394385 | 5021410 |
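In the Matrix Market coordinate format, lines starting with `%` form the banner and comments, and the first line after them lists the number of rows, columns, and stored entries. The sketch below prints that size line so it can be compared with the figures listed here. The `mm_size` helper is hypothetical, and how the stored-entry count relates to the edge count (for example, for symmetric matrices, where only one triangle is stored) is an assumption to check per file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the size line (rows, columns, stored entries)
# of a Matrix Market file.
# Usage: mm_size FILE
mm_size() {
    # Skip the banner and comment lines (they start with %) and any blank
    # lines; the first remaining line holds the sizes.
    awk '!/^%/ && NF { print; exit }' "$1"
}
```

Because awk exits after the first match, this reads only the header, so it stays fast even on the largest files.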

Social Network Benchmark (SNB) Interactive v1

📥 Repository

SNB Interactive v1: CsvBasic serializer using LongDateFormatter

These data sets were generated incorrectly (see the related issue), so we removed their links. The correctly generated data sets will be deployed in autumn 2022.

SNB Interactive v1: CsvBasic serializer using StringDateFormatter

SNB Interactive v1: CsvComposite serializer using LongDateFormatter

These data sets were generated correctly, unlike the other data sets that use the LongDateFormatter. Feel free to use them.

SNB Interactive v1: CsvComposite serializer using StringDateFormatter

SNB Interactive v1: CsvCompositeMergeForeign serializer using LongDateFormatter

These data sets were generated incorrectly (see the related issue), so we removed their links. The correctly generated data sets will be deployed in autumn 2022.

SNB Interactive v1: CsvCompositeMergeForeign serializer using StringDateFormatter

SNB Interactive v1: CsvMergeForeign serializer using LongDateFormatter

These data sets were generated incorrectly (see the related issue), so we removed their links. The correctly generated data sets will be deployed in autumn 2022.

SNB Interactive v1: CsvMergeForeign serializer using StringDateFormatter

SNB Interactive v1: TTL serializer

Substitution parameters

All: substitution_parameters.tar.zst

Update streams

SF0.1

SF0.3

SF1

SF3

SF10

SF30

SF100

SF300

SF1000


Labelled Subgraph Query Benchmark (LSQB)

📥 Repository

Merged FK

Projected FK


SIGMOD 2014 Programming Contest

📥 Repository

Data sets used in the original contest

New data sets


Social Network Benchmark (SNB) Business Intelligence (BI)

📥 Repository

TBA

Contributors

szarnyasg
