gart's Introduction

GART: Graph Analysis on Relational Transactional Datasets

GART is a graph extension that includes an interface to an RDBMS and a dynamic graph store for online graph processing. It is designed to bridge the gap between relational OLTP and graph-based OLAP.

Please refer to the GART documentation for more details.

What is GART

We want to use graph data flexibly without modifying the existing relational database system. Moreover, users should not need to be aware of how graph data is stored or how relational data and graph data are kept synchronized for freshness. To meet these requirements, we built GART, an in-memory system for real-time online graph computation.

GART uses transactional logs (e.g., binlog) to capture data changes, then replays these changes into fresh graph data in real time. GART integrates graph computation engines (e.g., GraphScope, NetworkX) to support efficient graph computation. The workflow of GART is as follows.

  • 1. Preprocess (Capture & Parser): GART captures data changes from data sources through their logs (e.g., binlogs in SQL systems). It then parses these logs into a recognized format called TxnLog. Currently, we use Debezium (for MySQL, PostgreSQL, ...) for log capture.

    A sample TxnLog entry for an inserted tuple looks as follows (Debezium style, showing only the necessary fields):

    {
      "before": null,
      "after": {
          "org_id": "0",
          "org_type": "company",
          "org_name": "Kam_Air",
          "org_url": "http://dbpedia.org/resource/Kam_Air"
      },
      "source": {
          "ts_ms": 1689159703811,
          "db": "ldbc",
          "table": "organisation"
      },
      "op": "c"
    }
    

    This sample is the log entry for inserting a tuple into the organisation table.

  • 2. Model Convert (RGMapping Converter): This is a key step in GART. Converting between data models for hybrid transactional and graph analytical processing (HTGAP) workloads requires additional semantic information, such as the mapping between relational tables and vertex/edge types and the mapping between relational attributes and vertex/edge properties. The GART administrator (such as a DBA) defines the relation-graph mapping (RGMapping) rules once, through the interfaces provided by GART. GART then automatically converts relational data changes into graph data changes in unified logs (UnifiedLog).

  • 3. Graph Store (Dynamic GStore): GART applies the graph data changes to the graph store. The graph store is dynamic: writes from GART and reads from graph analysis processing can execute on the store concurrently.
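
Steps 1 and 2 can be illustrated with a minimal sketch that parses the sample TxnLog entry above and turns it into a graph change. This is not GART's implementation: the `RGMAPPING` dictionary, the `GraphChange` class, and the `convert_txn_log` function are hypothetical names introduced for illustration, the mapping layout does not follow GART's actual RGMapping file format, and only vertex changes are handled.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical RGMapping rule (illustrative only, not GART's real schema):
# maps a relational table to a vertex label and its columns to properties.
RGMAPPING = {
    "organisation": {
        "label": "organisation",
        "id_column": "org_id",
        "properties": {"org_type": "type", "org_name": "name", "org_url": "url"},
    }
}

# Debezium operation codes: c = create, u = update, d = delete.
OPS = {"c": "add_vertex", "u": "update_vertex", "d": "delete_vertex"}


@dataclass
class GraphChange:
    """Hypothetical stand-in for one UnifiedLog entry."""
    op: str
    label: str
    vertex_id: str
    properties: dict
    ts_ms: int


def convert_txn_log(raw: str, mapping: dict = RGMAPPING) -> Optional[GraphChange]:
    """Convert one Debezium-style TxnLog entry into a graph change."""
    event = json.loads(raw)
    rule = mapping.get(event["source"]["table"])
    op = OPS.get(event["op"])
    if rule is None or op is None:
        return None  # table not mapped, or operation not handled in this sketch
    # Deletes carry the tuple in "before"; inserts and updates in "after".
    row = event["before"] if event["op"] == "d" else event["after"]
    props = {
        graph_key: row[column]
        for column, graph_key in rule["properties"].items()
        if column in row
    }
    return GraphChange(op, rule["label"], row[rule["id_column"]], props,
                       event["source"]["ts_ms"])


SAMPLE = """
{
  "before": null,
  "after": {
    "org_id": "0",
    "org_type": "company",
    "org_name": "Kam_Air",
    "org_url": "http://dbpedia.org/resource/Kam_Air"
  },
  "source": {"ts_ms": 1689159703811, "db": "ldbc", "table": "organisation"},
  "op": "c"
}
"""

change = convert_txn_log(SAMPLE)
print(change)
```

Keeping the mapping as data rather than code mirrors the idea that the administrator defines the rules once and the conversion then runs automatically for every captured change.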

Features

Compared to current solutions that provide graph interfaces on relational data, GART has three main features:

Transparent Data Model Conversion

To accommodate diverse workloads, GART provides transparent data model conversion through graph extraction interfaces, which define the relational-graph mapping rules.

We provide a sample definition file called rgmapping-ldbc.yaml.

Efficient Dynamic Graph Storage

To ensure the performance of graph analytical processing (GAP), GART proposes an efficient dynamic graph storage with good locality that stems from key insights into HTGAP workloads, including:

  1. an efficient and mutable compressed sparse row (CSR) representation to guarantee the locality of scanning edges;
  2. a coarse-grained MVCC to reduce the temporal and spatial overhead of versioning;
  3. a flexible property storage to efficiently run various GAP workloads.

Please refer to our paper for specific technical implementation details.
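
To show why a CSR layout gives good locality when scanning edges, here is a minimal sketch of a plain, static CSR. It is not GART's mutable, versioned structure, which is described in the paper; the function names `build_csr` and `out_neighbors` are assumptions made for this example.

```python
from typing import List, Tuple


def build_csr(num_vertices: int, edges: List[Tuple[int, int]]):
    """Build a static CSR (offsets, neighbors) from (src, dst) pairs."""
    offsets = [0] * (num_vertices + 1)
    # Count the out-degree of each source vertex.
    for src, _ in edges:
        offsets[src + 1] += 1
    # Prefix sum: offsets[v] becomes the start of v's neighbor range.
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]
    neighbors = [0] * len(edges)
    cursor = offsets[:-1]  # slicing copies: next free slot per vertex
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors


def out_neighbors(offsets: List[int], neighbors: List[int], v: int) -> List[int]:
    """Return v's out-neighbors, stored as one contiguous slice."""
    return neighbors[offsets[v]:offsets[v + 1]]


edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
offsets, neighbors = build_csr(3, edges)
print(offsets)                               # [0, 2, 3, 4]
print(out_neighbors(offsets, neighbors, 0))  # [1, 2]
```

Because all edges of a vertex sit next to each other in `neighbors`, an edge scan reads one contiguous range. The cost is that a static CSR is expensive to update in place, which is the problem a mutable variant has to address.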

Service-Oriented Deployment Model

GART acts as a service that synchronizes database changes to the graph store. When launched as a standalone service, it lets users access the full functionality of GART and of different graph computation engines on the graph store. GART also provides a front-end in the form of a database plug-in, currently supported as a PostgreSQL extension. Users can invoke GART's functions from the database client, such as defining RGMapping rules and running graph computation on the graph store.

Getting Started

Requirements

Run GART

Please refer to our documentation.

License

GART is released under the Apache License 2.0. Please note that third-party libraries may not have the same license as GART.

Publications

[USENIX ATC '23] Bridging the Gap between Relational OLTP and Graph-based OLAP. Sijie Shen, Zihang Yao, Lin Shi, Lei Wang, Longbin Lai, Qian Tao, Li Su, Rong Chen, Wenyuan Yu, Haibo Chen, Binyu Zang, Jingren Zhou. USENIX Annual Technical Conference, Boston, MA, USA, July 2023.

gart's People

Contributors

ds-ssj, doudoubobo

gart's Issues

Build the document of GART

  • Build the HTML document framework
  • Introduction
  • Getting-started
  • Key Concepts
  • Deployment
  • Use cases
  • Generate gh-pages (preview)
  • Automatic generation workflow
  • Update README
  • Tutorial
    • Configuration
    • RGMapping
    • Graph Analytical Processing

Integration with NetworkX

Make NetworkX run over GART storage.

  • RPC framework on GART
  • Message proto design
  • passing message through json string
  • serialize json with msgpack
  • enable graph topology ops
  • enable vertex/edge property ops
  • impl NetworkX graph class
  • Impl for GART fragment RPC framework
  • run network app on GART storage
  • installation scripts
  • decouple version with graph server
  • invoke gae and return results
  • support local deployment optimization
  • add docs

[Feature-request] Build the GRIN sources into a library and complete the install step in CMakeList

Is your feature request related to a problem? Please describe.
When I run

cmake .. && make -j

I find that GART does not build the GRIN-related source code.

Running

sudo make install

then installs only GAE, not GART.
Please add the GRIN build to the CMakeList and complete the install process.

Enriching GART's use cases

  • Demo plan
  • Enhance the design of transactions (e.g., Smallbank)
  • Fix bugs when using different RGMapping rules
  • Function Enhance: RGMapping checker & formatter
  • Function Enhance: Use MySQL's GTID for consistent epoch assignment
  • Evaluation
  • Update documentation of use cases

GART open-source enhancement

  • dockerfile for GART
  • scripts for launching GART in a distributed setting
  • divide GART image to multiple images
  • k8s deployment
  • automatic k8s deployment
  • docs on how to use and deploy GART

Support for Querying Graph Data in PostgreSQL Extension

This requires launching the GIE service. Graph queries on the GART store are currently supported by calling GIE through the Python interface; this functionality can be integrated into the extension for querying.

SELECT *
FROM gart_query_by_gremlin($$
                           g.V().count()
                           $$);
  • Interfaces for graph definition
  • Design the workflow of query processing
  • Implementation of the interface
  • Log management
  • Exit status
  • Query server launching (NetworkX)
  • Transmission of requests and responses
  • Demo: put it together
  • Evaluation: PostgreSQL Recursive, Apache Age Cypher, Apache Age Dijkstra
  • Freshness

GART Helm deployment

Support Helm deployment on k8s

  • Add retry when required services are not ready
  • auto debezium config
  • postgresql support
  • End-to-end demo

Launching the GART service via PostgreSQL Extension

Users can use GART as a PostgreSQL extension, including launching the service, defining transformation rules (from relational data to graph), and performing graph queries.

  • Building the PostgreSQL Extension framework
  • Launching the GART service via current scripts
  • Manage the file permissions of the extension (restricted by the environment in which the extension runs in PostgreSQL)
  • Optimize the user commands and interfaces
  • Unified configuration management using configuration files (system file paths, GART configuration, etc.)
