
TiFlink

An experimental materialized view solution based on TiDB/TiKV and Flink with strong consistency support.

Description

TiFlink is intended to provide an easy-to-use materialized view on TiDB. It is mainly based on Flink and supports Flink StreamSQL syntax. You only need to specify a SQL query and a JDBC URL for your TiDB instance; TiFlinkApp takes care of the rest.

TiFlink has no external dependencies other than TiDB/TiKV and Flink, so there is no need to maintain Kafka or TiCDC clusters. It reads from and writes to TiKV directly with parallel tasks, which should maximize throughput while keeping latency low. TiFlink first takes a full snapshot of the existing source tables and then automatically switches to CDC stream processing.
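The snapshot-then-CDC handover described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not TiFlink's actual code: the `SnapshotThenCdcSketch` class, its `Change` type, and the `replay` method are hypothetical names, and the tables are plain in-memory maps. It shows one common way to make the handover safe: change events whose commit timestamp is not newer than the snapshot timestamp are skipped, because the snapshot already reflects them.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of "snapshot first, then CDC"; hypothetical, not TiFlink's implementation. */
final class SnapshotThenCdcSketch {

    /** A hypothetical change event: a put, or a delete when value is null. */
    static final class Change {
        final long commitTs;
        final long key;
        final String value;

        Change(long commitTs, long key, String value) {
            this.commitTs = commitTs;
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Rebuilds the table state from a snapshot read at snapshotTs plus the
     * change events committed after that timestamp, applied in commit order.
     */
    static Map<Long, String> replay(Map<Long, String> snapshot, long snapshotTs, List<Change> changes) {
        Map<Long, String> state = new TreeMap<>(snapshot);
        List<Change> ordered = new ArrayList<>(changes);
        ordered.sort(Comparator.comparingLong((Change c) -> c.commitTs));
        for (Change c : ordered) {
            if (c.commitTs <= snapshotTs) {
                continue; // already reflected in the snapshot
            }
            if (c.value == null) {
                state.remove(c.key);
            } else {
                state.put(c.key, c.value);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        Map<Long, String> snapshot = new TreeMap<>();
        snapshot.put(1L, "alice");
        snapshot.put(2L, "bob");
        List<Change> changes = List.of(
            new Change(90L, 1L, "stale"),  // older than the snapshot, skipped
            new Change(110L, 2L, null),    // delete of key 2
            new Change(120L, 3L, "carol")); // insert of key 3
        System.out.println(replay(snapshot, 100L, changes)); // {1=alice, 3=carol}
    }
}
```

Filtering by timestamp means the change stream may safely start slightly before the snapshot point without applying any change twice.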

TiFlink is designed to provide strong consistency. It uses raw transaction information and global timestamps from TiKV to achieve "Stale Snapshot Isolation": every time you query the target table, you see a consistent snapshot of the materialized view as of some point in the past. The ACID guarantees of transactions on the source tables, as well as linearizability, are thus preserved. For details, please refer to the draft doc.
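The reader-visible behavior of a stale but consistent snapshot can be sketched with a small multi-version store. This is only a toy model of the guarantee, not the mechanism TiFlink uses on top of TiKV: the `StaleSnapshotSketch` class and its `write`, `advanceVisibleTs`, and `read` methods are hypothetical names. Readers see only versions at or below a published timestamp, so a half-applied source transaction is never visible.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of reading a view at a stale but consistent timestamp; hypothetical. */
final class StaleSnapshotSketch {
    // key -> (commit timestamp -> value written at that timestamp)
    private final Map<String, TreeMap<Long, Long>> versions = new TreeMap<>();
    // Readers only see versions with commit timestamp <= visibleTs.
    private long visibleTs = 0L;

    /** Records one row version; it stays invisible until visibleTs reaches commitTs. */
    void write(String key, long commitTs, long value) {
        versions.computeIfAbsent(key, k -> new TreeMap<>()).put(commitTs, value);
    }

    /** Publishes everything up to ts; call only once all writes up to ts are in place. */
    void advanceVisibleTs(long ts) {
        if (ts < visibleTs) {
            throw new IllegalArgumentException("visible timestamp must not move backwards");
        }
        visibleTs = ts;
    }

    /** Returns the newest visible value for key, or null if none is visible yet. */
    Long read(String key) {
        TreeMap<Long, Long> history = versions.get(key);
        if (history == null) {
            return null;
        }
        Map.Entry<Long, Long> entry = history.floorEntry(visibleTs);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        StaleSnapshotSketch view = new StaleSnapshotSketch();
        view.write("A", 10L, 100L);
        view.write("B", 10L, 0L);
        view.advanceVisibleTs(10L);

        // A source transaction committed at ts=20 moves 30 from A to B.
        view.write("A", 20L, 70L);
        // Only half of it is applied, so readers still see the ts=10 snapshot.
        System.out.println(view.read("A") + " " + view.read("B")); // 100 0
        view.write("B", 20L, 30L);
        view.advanceVisibleTs(20L);
        System.out.println(view.read("A") + " " + view.read("B")); // 70 30
    }
}
```

In both reads the balances sum to 100: the reader sees the state either before or after the transfer, never a mix of the two.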

TiFlink is currently in early preview and is not production-ready. Bug reports and contributions are welcome.

Key Features

  1. Strong consistency
  2. Minimal deployment dependencies
  3. Read/write directly from/to TiKV
  4. Unified Batch/Stream processing
  5. Easy to use

Dependencies

  • Java >= 11
  • TiDB/TiKV >= 5.0.0
  • Flink >= 1.13.0
  • tikv/client-java >= 3.2.0 (Pending release)

Usage

See TiFlinkExample.java

TiFlinkApp.newBuilder()
   .setJdbcUrl("jdbc:mysql://root@localhost:4000/test") // Please make sure the user has the correct permissions
   .setQuery(
       "select id, "
           + "first_name, "
           + "last_name, "
           + "email, "
           + "(select count(*) from posts where author_id = authors.id) as posts "
           + "from authors")
   // .setColumnNames("a", "b", "c", "d") // Override column names inferred from the query
   // .setPrimaryKeys("a") // Specify the primary key columns, defaults to the first column
   // .setDefaultDatabase("test") // Default TiDB database to use, defaults to that specified by JDBC URL
   .setTargetTable("author_posts") // TiFlink will automatically create the table if it does not exist
   // .setTargetTable("test", "author_posts") // It is possible to specify the full table path
   .setParallelism(3) // Parallelism of the Flink job
   .setCheckpointInterval(1000) // Checkpoint interval in milliseconds. This interval determines the data refresh rate
   .setDropOldTable(true) // Whether TiFlink should drop the old target table on start
   .setForceNewTable(true) // Whether to throw an error if the target table already exists
   .build()
   .start(); // Start the app
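
To make concrete what the `author_posts` target table in the example above is expected to hold, here is a plain-Java sketch that computes the same per-author post count over in-memory data. It does not use TiFlink or Flink; the `AuthorPostsSketch` class and `postsPerAuthor` method are hypothetical names, and the other selected columns (`first_name`, `last_name`, `email`) are left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory equivalent of the example query's post count; hypothetical helper. */
final class AuthorPostsSketch {

    /**
     * Mirrors "(select count(*) from posts where author_id = authors.id)":
     * every author gets a row, and authors without posts get a count of 0.
     */
    static Map<Long, Long> postsPerAuthor(List<Long> authorIds, List<Long> postAuthorIds) {
        Map<Long, Long> counts = new LinkedHashMap<>();
        for (Long id : authorIds) {
            counts.put(id, 0L);
        }
        for (Long authorId : postAuthorIds) {
            // Posts whose author_id matches no author do not create a row.
            counts.computeIfPresent(authorId, (id, n) -> n + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Long> authors = List.of(1L, 2L, 3L);
        List<Long> postAuthorIds = List.of(1L, 1L, 3L, 9L);
        System.out.println(postsPerAuthor(authors, postAuthorIds)); // {1=2, 2=0, 3=1}
    }
}
```

Because the count comes from a scalar subquery rather than an inner join, author 2 still appears in the result with zero posts.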

TODO

  • Non-integer and compound primary key support
  • Better Flink task to TiKV Region mapping
  • Automatically clean up uncommitted transactions on restart
  • Unit tests

License

See LICENSE.
