Giter VIP home page Giter VIP logo

ex-graph's Introduction

EX-Graph

EX-Graph is a rich dataset containing Ethereum transaction data and X data. With a focus on bridging the gap between anonymous Ethereum activities and the leading social media X's information, it's specifically designed to facilitate various tasks including Ethereum link prediction, wash-trading addresses detection, and matching link prediction between Ethereum addresses and X accounts. By integrating these two different dimensions, EX-Graph provides an on-chain and off-chain resource for researchers and analysts to explore the complex dynamics of Ethereum.

Overview

Additionally, we've made the EX-Graph dataset accessible through a dedicated website EX-Graph. Here, you are welcomed to upload and compare your experiment results on our leaderboard.

Table of Contents

Dataset Overview

EX-Graph dataset consists of several components:

  • X Graph: X accounts and their following relationships.
  • Ethereum Graph: Ethereum addresses and their transaction records.
  • Ethereum Graph with X features: A DGL graph contains Ethereum addresses and their transaction records, as well as Ethereum addresses features got from both Ethereum and matched-X accounts, it is used for Ethereum link prediction task.
  • Wash-trading Addresses Detection Training Graph: Training graph used for wash-trading addresses detection task.
  • Wash-trading Addresses Detection Validation Graph: Validation graph used for wash-trading addresses detection task.
  • Wash-trading Addresses Detection Test Graph: Test graph used for wash-trading addresses detection task.
  • Matching Link Prediction Graph: DGL graph used for matching link prediction task.
  • ethereum_link_prediction: Folder contains models to run Ethereum link prediction experiment and train, validation, test edge indices pickle files.
  • wash_trading_address_detection: Folder contains code of models to run wash trading address detection experiment.
  • matching_link_prediction: Folder contains models to run matching link prediction experiment and train, validation, test edge indices pickle files.
  • X_matching.csv: X accounts and the Ethereum addresses matched with them. We did anonymization operations so that you can see the X numerical id matching with Ethereum addresses without leaking any personal information.
  • dune_wash_trade_tx.csv: Wash-trading transactions downloaded from Dune.

Dataset Schema

We present the original datasets and datasets tailored for use in three distinct tasks. For each dataset, a Google Drive link is provided for easy access and download.

Dataset #Nodes #Edges Node Features Edge Features
X Graph 1,103,509 3,768,281 n.a. n.a.
Ethereum Graph 2,610,465 29,585,858 n.a. from_address, to_address, weight, block_number
Ethereum Graph with X features 1,709,575 13,170,869 X_features, ethereum_features, ethereum_X_combined_features from_address, to_address, weight, block_number
Wash-trading Addresses Detection Training Graph 1,268,607 8,276,446 ethereum_features, X_semantic_features, X_structure_features, ethereum_X_combined_features, wash_trading_label n.a.
Wash-trading Addresses Detection Validation Graph 452,930 1,182,349 ethereum_features, X_semantic_features, X_structure_features, ethereum_X_combined_features, wash_trading_label n.a.
Wash-trading Addresses Detection Test Graph 711,084 2,364,700 ethereum_features, X_semantic_features, X_structure_features, ethereum_X_combined_features, wash_trading_label n.a.
Matching Link Prediction Graph 2,833,659 18,433,366 combined_features n.a.

Getting Started

Installation

  1. Download this repository and extract the datasets to a desired location.

  2. Create a Conda environment using the provided environment.yml file, which includes the requirements to run experiments in this repository. Run:

    conda env create -f environment.yml
    

Requirements

We list main requirements of this repository below. For full requirements, please refer to the provided environment.yml file

  • dgl==1.1.0
  • torch==2.0.0+cu117
  • torch-cluster==1.6.1+pt20cu117
  • torch-geometric==2.3.0
  • torch-scatter==2.1.1+pt20cu117
  • torch-sparse==0.6.17+pt20cu117
  • torch-spline-conv==1.2.2+pt20cu117
  • torchaudio==2.0.1+cu117
  • torchmetrics==0.11.4
  • torchvision==0.15.1+cu117
  • transformers==4.30.2

Dataset Loading

We suppose that you have downloaded the datasets to the corresponding task's folders. Then, you can use the following command to load the data in the pkl format.

e.g. For Ethereum link prediction task, you have downloaded Ethereum Graph with X features in the folder. Then you can use the following command to load the dataset

import dgl
import pickle as pkl
with open('ethereum_graph_with_X_features.pkl', 'rb') as f:
    ethereum_graph_with_X_features = pkl.load(f)

Using EX-Graph

This section details how to using EX-Graph to run benchmark baselines. We explain experiments Ethereum link prediction, wash-trading addresses detection, matching link prediction one by one.

Ethereum Link Prediction

  1. Navigate to ethereum_link_prediction folder, and download Ethereum Graph with X features to this folder.

  2. Each experiment code file is named as model_wo/with_X.py. wo means this experiment does not consider X features. with means this experiment considers X features. For example, if you want to run clusterGCN model considering X features in Ethereum link prediction, the command is as below:

    python clusterGCN_with_X.py
    

Wash-trading Addresses Detection

  1. Navigate to wash_trading_address_detection folder, and download the wash trading graphs: Wash-trading Addresses Detection Training Graph, Wash-trading Addresses Detection Validation Graph, Wash-trading Addresses Detection Test Graph to this folder.

  2. Each experiment code file is named as model_wo/with_X.py. wo means this experiment does not consider X features. with means this experiment considers X features. For example, if you want to run GCN model considering X features, the command is as below:

    python GCN_with_X.py
    

Matching Link Prediction

  1. Navigate to matching_link_prediction folder, and download Matching Link Prediction Graph to this folder.

  2. Each experiment code file is named as model.py. For example, if you want to run GCN model on this matching link prediction, the command is as below:

    python GCN.py
    

License

This dataset is under license CC BY-NC-SA.

Acknowledgement

We thank Ruochen Li for building the website and providing valuable suggestions for this repository.

ex-graph's People

Contributors

persdre avatar

Stargazers

Yunqi Chen avatar  avatar  avatar Shengzhong Zhang avatar  avatar Gong Zheng avatar Rock avatar Junyi avatar lirc572 avatar Han avatar

Watchers

lirc572 avatar  avatar

Forkers

uniqueclouds

ex-graph's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.