Ethereum Fraud Detection Visualization


September 2022


Overview

Fraud detection is the set of activities undertaken to detect and block attempts to obtain money or property through false means. Fraud is an expensive and complicated problem. To detect and investigate it effectively, you need to see connections (between people, accounts, transactions, and dates) and understand complex sequences of events. That means analyzing a lot of data. Fraud detection is used across banking, insurance, medical, government, and public sectors, as well as in law enforcement agencies.

Advantages of Visualizations in Fraud Detection:

The detection of fraud schemes requires an investigation of a vast amount of data that stems from many different anti-fraud systems with varying types of data. The auditors have to combine all the data and use statistical methods to uncover suspicious claims, which is time-consuming and inefficient in most cases.

Visualizations, on the other hand, help auditors quickly identify relationships and significant structures and detect suspicious patterns that may be hidden in large amounts of data. Beyond visual exploration, interacting with the data allows for a deeper understanding of how the dependencies within the data change over time.

One of the most challenging tasks when using visualization for fraud detection is the sheer amount of data that is usually obtained by auditing systems. First, the auditor has to retrieve the data from the auditing system. Visualizing such a large amount of data is the next challenge: the data needs a meaningful arrangement to create a human-readable representation. Providing suitable styling should enable users to identify different types of entities and relations.

Because there are many different types of fraud schemes, no single solution can detect all of them. Thus, a visualization meant to fight fraud has to adapt to the needs of each auditor.

First, it must not be limited to a specific amount or type of data, since the volume of investigated data grows exponentially and comes from different sources. In some cases, it must also support and visualize time-dependent data.

A sophisticated visualization should also provide the means for arranging the elements in multiple ways on the screen, i.e., using arrangements that reveal clusters or others that highlight hierarchical structures. Additionally, more sophisticated graph analysis algorithms should be supported for the detection of fraud schemes, e.g., cycle detection, or shortest paths.
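The two graph algorithms named above can be illustrated on a toy transfer graph. This is a minimal sketch using only the Python standard library; the addresses, the `TRANSFERS` list, and the function names are hypothetical and are not taken from the notebooks.

```python
from collections import deque

# Hypothetical transfers as (sender, receiver) pairs; addresses are placeholders.
TRANSFERS = [
    ("A", "B"),
    ("B", "C"),
    ("C", "A"),  # closes the loop A -> B -> C -> A
    ("C", "D"),
    ("D", "E"),
]


def build_graph(transfers):
    """Build a directed adjacency list from (sender, receiver) pairs."""
    graph = {}
    for sender, receiver in transfers:
        graph.setdefault(sender, []).append(receiver)
        graph.setdefault(receiver, [])
    return graph


def has_cycle(graph):
    """Return True if the directed graph contains a cycle (depth-first search)."""
    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}

    def visit(node):
        state[node] = in_progress
        for nxt in graph[node]:
            if state[nxt] == in_progress:
                return True  # back edge: we returned to a node on the current path
            if state[nxt] == unvisited and visit(nxt):
                return True
        state[node] = done
        return False

    return any(state[node] == unvisited and visit(node) for node in graph)


def shortest_path(graph, source, target):
    """Return the path with the fewest hops from source to target, or None."""
    if source not in graph or target not in graph:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


graph = build_graph(TRANSFERS)
print(has_cycle(graph))
print(shortest_path(graph, "A", "E"))
```

A cycle of transfers can hint at funds being routed back to their origin, and a short path between two accounts can hint at a hidden link. Neither is proof of fraud on its own.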

Regarding the representation of the elements of the visualization, an auditor should be able to customize the look and feel of the graph elements based on their needs and to display additional properties of the graph elements. Finally, interaction is one of the essential operations when visualizing fraud data, since it allows the auditor to explore the dataset.

Fraud detection techniques can be divided into statistical data analysis techniques and artificial intelligence (AI) techniques.

Statistical data analysis techniques include:

  1. calculating statistical parameters
  2. regression analysis
  3. probability distributions and models
  4. data matching
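As a minimal sketch of the first item, calculating statistical parameters, the snippet below computes a mean and standard deviation and flags values whose z-score exceeds a threshold. The sample values, the function name, and the threshold are illustrative assumptions, not part of the original analysis.

```python
import statistics


def zscore_outliers(values, threshold):
    """Return (index, value) pairs whose absolute z-score exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - mean) / stdev > threshold
    ]


# Hypothetical transaction amounts; the last one is deliberately extreme.
amounts = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.1, 0.9, 50.0]

# With only ten points a z-score cannot exceed 3, so a lower threshold is used here.
print(zscore_outliers(amounts, threshold=2.0))
```

On real data with heavy-tailed amounts, a robust statistic such as the median absolute deviation may be a better choice than the mean and standard deviation.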

AI techniques used to detect fraud include:

  1. Data mining classifies, groups and segments data to search through millions of transactions to find patterns and detect fraud.
  2. Neural networks learn suspicious-looking patterns and use them to detect further instances.
  3. Machine learning automatically identifies characteristics found in fraud.
  4. Pattern recognition detects classes, clusters and patterns of suspicious behavior.

Cryptocurrency fraud analysts look at huge volumes of historical data spanning long time periods. Our main idea is to comprehensively examine and visualize the available data related to fraud detection in the Ethereum network.

Our suggested steps to visualize data:

  1. Downloading and collecting data
  2. Data cleaning
  3. Data statistics and distribution
  4. Comparing different features of data between fraud and non-fraud classes
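The four steps above can be sketched with pandas on a tiny inline table. Everything specific here is an assumption made for illustration: the column names (`address`, `flag`, `total_received`, `sent_tx_count`), the values, and the cleaning choices (dropping exact duplicates, filling gaps with the median) are not taken from the notebooks.

```python
import io

import pandas as pd

# Stand-in for a downloaded CSV file; columns and values are hypothetical.
RAW_CSV = """address,flag,total_received,sent_tx_count
0xaaa,0,10.5,12
0xbbb,1,250.0,3
0xbbb,1,250.0,3
0xccc,0,,7
0xddd,1,400.0,2
0xeee,0,8.0,20
"""

# Step 1: collect the data (here read from the inline string).
df = pd.read_csv(io.StringIO(RAW_CSV))

# Step 2: clean it by dropping exact duplicate rows and filling
# missing numeric values with the column median.
df = df.drop_duplicates().reset_index(drop=True)
numeric_cols = ["total_received", "sent_tx_count"]
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# Step 3: summary statistics and class distribution.
summary = df[numeric_cols].describe()
class_counts = df["flag"].value_counts()

# Step 4: compare feature means between fraud (1) and non-fraud (0) rows.
by_class = df.groupby("flag")[numeric_cols].mean()

print(summary)
print(class_counts)
print(by_class)
```

The `by_class` table is the kind of input a grouped bar chart or box plot would be drawn from.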

Datasets

We use two datasets in this report:

  1. Ethereum Fraud Detection Dataset
  2. Ethereum Fraud Dataset

We will analyze these two datasets both individually and in combination.

| Dataset | Number of Features | Total Cases | Fraud Cases | Non-Fraud Cases |
|---|---|---|---|---|
| Ethereum Fraud Detection Dataset | 37 | 9816 | 2179 | 7637 |
| Ethereum Fraud Dataset | 31 | 12146 | 5150 | 6996 |
| Merged Dataset | 17 | 20302 | 5675 | 14627 |

Table1. Datasets Overview
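The class balance implied by Table1 can be checked with a few lines of arithmetic. The counts below are copied from the table; the dictionary layout and function name are only illustrative.

```python
# Counts copied from Table1.
DATASETS = {
    "Ethereum Fraud Detection Dataset": {"total": 9816, "fraud": 2179, "non_fraud": 7637},
    "Ethereum Fraud Dataset": {"total": 12146, "fraud": 5150, "non_fraud": 6996},
    "Merged Dataset": {"total": 20302, "fraud": 5675, "non_fraud": 14627},
}


def fraud_share(counts):
    """Fraction of cases labelled as fraud."""
    return counts["fraud"] / counts["total"]


for name, counts in DATASETS.items():
    # Sanity check: the two classes must add up to the total.
    assert counts["fraud"] + counts["non_fraud"] == counts["total"]
    print(f"{name}: {fraud_share(counts):.1%} fraud")

# Difference between the sum of the two source datasets and the merged total.
rows_dropped = (
    DATASETS["Ethereum Fraud Detection Dataset"]["total"]
    + DATASETS["Ethereum Fraud Dataset"]["total"]
    - DATASETS["Merged Dataset"]["total"]
)
print(rows_dropped)
```

All three datasets are imbalanced toward the non-fraud class, which is worth keeping in mind when comparing feature distributions between classes. The merged total is smaller than the sum of the two source totals. The table does not say why; removal of overlapping records is one possible explanation, but that is an assumption.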

Requirements

  1. Python >= 3.5
  2. pandas >= 0.24.2
  3. matplotlib >= 3.0.3
  4. seaborn >= 0.9.1
  5. numpy >= 1.18.5
  6. notebook >= 5.7.4
  • Run pip install -r requirements.txt or pip3 install -r requirements.txt

Notebooks

| Dataset | GitHub Viewer | NB Viewer | Google Colab |
|---|---|---|---|
| Ethereum Fraud Detection Dataset | Link | Link | Link |
| Ethereum Fraud Dataset | Link | Link | Link |
| Merged Dataset | Link | Link | Link |

Table2. Notebooks

Visualization Example

Only a few examples are shown here. The full set of visualizations and all the code can be found in the notebooks.

Fig1. Data Distribution Pie Diagram


Fig2. Most Received Token Type Pie Diagram (Fraud Cases)


Fig3. Comparison of Received Transactions Statistics


Fig4. Features Correlation Diagram


Fig5. Features Distribution Diagram
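A chart in the style of Fig1 can be sketched with matplotlib. This is an assumption-laden example rather than the notebook code: it uses the fraud and non-fraud counts of the Ethereum Fraud Detection Dataset from Table1, and the title, output file name, and styling are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # render without a display; not needed inside a notebook
import matplotlib.pyplot as plt

# Class counts of the Ethereum Fraud Detection Dataset (from Table1).
labels = ["Fraud", "Non-Fraud"]
counts = [2179, 7637]

fig, ax = plt.subplots(figsize=(4, 4))
wedges, texts, autotexts = ax.pie(
    counts,
    labels=labels,
    autopct="%1.1f%%",  # print each slice's share of the total
    startangle=90,
)
ax.set_title("Data Distribution")
ax.axis("equal")  # keep the pie circular

# Hypothetical output file name.
fig.savefig("data_distribution_pie.png", dpi=100)
```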

Cite

If you use this repo in your work, please cite it using the following metadata:

Haghighi, S., & Ramezani, F. (2022). Ethereum Fraud Detection Models (Version 1.0) [Computer software]. https://github.com/sepandhaghighi/Ethereum-Fraud-Detection-Models
@software{Haghighi_Ethereum_Fraud_Detection_2022,
author = {Haghighi, Sepand and Ramezani, Farzad},
license = {MIT},
month = {10},
title = {{Ethereum Fraud Detection Models}},
url = {https://github.com/sepandhaghighi/Ethereum-Fraud-Detection-Models},
version = {1.0},
year = {2022}
}
