Giter VIP home page Giter VIP logo

mixmicro-daas-pdk's Introduction

iDaaS PDK

GitHub stars License

What is iDaaS

Data as a service (DaaS) is a data management strategy that uses the cloud(or centralized data platform) to deliver data storage, integration, processing, and servicing capabilities in an on demand fashion.

Tapdata iDaaS, or Incremental Data as a Service, is an open source implementation(Soon to be released) of the DaaS architecture that puts focus on the incremental updating capability of the data platform. Please refer to Tapdata iDaaS Project for more information.

What is PDK for iDaaS

PDK is Plugin Development Kit for iDaaS. Provide an easier way to develop data source connectors

  • Database connectors
    • MySql, Oracle, Postgres, etc.
  • SaaS connectors
    • Facebook, Salesforce, Google docs, etc.
  • Custom connectors
    • Connect your custom data sources.

PDK connectors provide data source to flow in iDaaS pipeline to process, join, etc, and flow in a target which is also provided by PDK connectors.

Quick Start

Java 8+ and Maven need to be installed.

Generate Java Sample Project, then start filling the necessary methods to implement a PDK connector.

@TapConnectorClass("spec.json")
public class SampleConnector extends ConnectorBase implements TapConnector {
    
    @Override
    public void discoverSchema(TapConnectionContext connectionContext, Consumer<List<TapTable>> consumer) {
        //TODO Load schema from database, connection information in connectionContext#getConnectionConfig
        //Sample code shows how to define tables with specified fields.
        consumer.accept(list(
                //Define first table
                table("empty-table1")
                        //Define a field named "id", origin field type, whether is primary key and primary key position
                        .add(field("id", "VARCHAR").isPrimaryKey(true).partitionKeyPos(1))
                        .add(field("description", "TEXT"))
                        .add(field("name", "VARCHAR"))
                        .add(field("age", "DOUBLE"))
        ));
    }

    private void streamRead(TapConnectorContext connectorContext, Object offset, Consumer<List<TapEvent>> consumer) {
        //TODO using CDC API or log to read stream records from database, use consumer#accept to send to incremental engine.
        //Below is sample code to generate stream records directly
        while (!isShutDown.get()) {
            List<TapEvent> tapEvents = list();
            for (int i = 0; i < 10; i++) {
                TapInsertRecordEvent event = insertRecordEvent(map(
                        entry("id", counter.incrementAndGet()),
                        entry("description", "123"),
                        entry("name", "123"),
                        entry("age", 12)
                ), connectorContext.getTable());
                tapEvents.add(event);
            }
            consumer.accept(tapEvents);
        }
    }

    private void writeRecord(TapConnectorContext connectorContext, List<TapRecordEvent> tapRecordEvents, Consumer<WriteListResult<TapRecordEvent>> consumer) {
        //TODO write records into database
        //Need to tell incremental engine the write result
        consumer.accept(writeListResult()
                .insertedCount(tapRecordEvents.size()));
    }
}

Each connector need to provide a json file like above, described in TapConnector annotation.

spec.json describes three types information,

  • properties
    • Connector name, icon and id.
  • configOptions
    • Describe a form for user to input the information for how to connect and which database to open.
  • dataTypes
    • This will be required when current data source need create table with proper types before insert records. Otherwise, this key can be empty.
    • Describe the capability of types for current data source.
    • iDaaS Incremental Engine will generate proper types when table creation is needed.
{
  "properties": {
    "name": "PDK Sample Connector",
    "icon": "icon.jpeg",
    "id": "pdk-sample"
  },
  "configOptions": {
    "connection": {
      "type": "object",
      "properties": {
        "host": {
          "type": "string",
          "title": "Host",
          "required": true,
          "x-decorator": "FormItem",
          "x-component": "Input"
        },
        "port": {
          "type": "number",
          "title": "Port",
          "required": true,
          "x-decorator": "FormItem",
          "x-component": "Input"
        },
        "database": {
          "type": "string",
          "title": "Database",
          "required": true,
          "x-decorator": "FormItem",
          "x-component": "Input"
        }
      }
    }
  },
  "dataTypes": {
    "double": {"bit": 64, "to": "TapNumber"},
    "decimal[($precision,$scale)]": {"precision": [1, 27], "defaultPrecision": 10, "scale": [0, 9], "defaultScale": 0, "to": "TapNumber"}
  }
}

How to create PDK java project?

Requirements

For Windows

Install from source code. Please use gitbash to execute below commands for Windows,

git clone https://github.com/tapdata/idaas-pdk.git
cd idaas-pdk
mvn clean install

There are three types project templates. Assume "group" is "io.tapdata" and "name" is "XDB" and "version" is "0.0.1".

  • Target (Need create table before insertion)
./bin/tap template --type targetNeedTable --group io.tapdata --name XDB --version 0.0.1 --output ./connectors
  • Target (No table creation requirement)
./bin/tap template --type target --group io.tapdata --name XDB --version 0.0.1 --output ./connectors
  • Source
./bin/tap template --type source --group io.tapdata --name XDB --version 0.0.1 --output ./connectors

Use IntelliJ IDE to open idaas-pdk, you can find it named "xdb-connector" under connectors module.

Development Guide

PDK need developer to provide below,

  • Provide data types expression in spec.json for the database which need create table before record insertion.
  • Implement the methods that PDK required.
  • Use TDD to verify that the methods are implemented correctly.

Get started

Test Driven Development

Provide json file which given the values of "configOptions" required from "spec.json" file.

{
    "connection": {
      "host": "192.168.153.132",
      "port": 9030,
      "database": "test"
    }
}

The json file can be outside of git project as you may don't want to expose your password or ip/port, etc.

To run TDD command,

./bin/tap tdd --testConfig xdb_tdd.json ./connectors/xdb-connector

Once PDK connector pass TDD tests, proves that the PDK connector is ready to work in Tapdata. You can also contribute the PDK connector to idaas-pdk open source project.

How to contribute to idaas-pdk

  • Fork idaas-pdk
  • Create ${database}-connector module under connectors
  • Implemented the methods that PDK required, and the features you want to provide
  • Pass TDD tests
  • Submit Pull Request
  • Will merge after review

Roadmap

Check out our Roadmap for Core and our Roadmap for Connectors on GitHub. You'll see the features we're currently working on or about to. You may also give us insights, by adding your own issues and voting for specific features / integrations.

mixmicro-daas-pdk's People

Contributors

dobybros avatar leon7492 avatar losofo avatar tjworks avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.