
presentation-kafka-connect

💬 Description

Kafka Connect Demo

The purpose of this repository is to:

  • provide a slideshow that explains how Kafka Connect can be used, its principles, and guidelines to set it up to connect Kafka and PostgreSQL
  • provide a set of Docker images and shell scripts so people can run the demo by themselves
  • show some use cases

📚 Prerequisites

Docker and Docker Compose are required to run the demo locally.

🚀 How to use

In this demo, we deploy the entire Kafka Connect stack in a Docker environment. Here we use the following Confluent components:

  • schema-registry - "It provides a RESTful interface for storing and retrieving your Avro, JSON Schema, and Protobuf schemas."
  • kafka-connect - "Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems."

See also Kafka Connect Tutorial on Docker

1. Kafka Connect plugins

Kafka Connect needs plugins to interact with databases (here, the Debezium PostgreSQL source connector and the Confluent JDBC sink connector). Download them and extract their content into the plugins folder:
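
A minimal sketch of how this can be done (the connector versions, archive names and the local plugins path are assumptions; adapt them to the volumes declared in kafka-connect.yml):

# download and extract the Debezium PostgreSQL source connector from Maven Central
# (1.4.0.Final is an example version)
curl -L -o debezium-connector-postgres.tar.gz \
    https://repo1.maven.org/maven2/io/debezium/debezium-connector-postgres/1.4.0.Final/debezium-connector-postgres-1.4.0.Final-plugin.tar.gz
tar -xzf debezium-connector-postgres.tar.gz -C plugins/

# the Confluent JDBC sink connector can be downloaded as a zip archive from Confluent Hub
# and unzipped into the same folder
unzip confluentinc-kafka-connect-jdbc-10.0.1.zip -d plugins/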

See also Discover Kafka connectors and more

2. Deployment

Now, we can launch the environment:

$ docker-compose --project-name kafka-connect-pgsql -f kafka-connect.yml up -d
Creating postgres-sink   ... done
Creating postgres-source ... done
Creating zookeeper       ... done
Creating kafka           ... done
Creating confluent-registry ... done
Creating confluent-connect  ... done

Check that the confluent-connect container is up and ready:

$ docker logs confluent-connect | grep started
...
[2021-01-08 00:32:41,837] INFO REST resources initialized; server is started and ready to handle requests (org.apache.kafka.connect.runtime.rest.RestServer)
[2021-01-08 00:32:41,837] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect)
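
We can also check that the Connect and Schema Registry REST interfaces answer (ports 8083 and 8081 are the usual defaults and are assumed to be the ones exposed by the compose file):

# Kafka Connect root endpoint: returns the worker version as a small JSON document
curl -s http://localhost:8083/

# Schema Registry: lists the subjects (schemas) registered so far
curl -s http://localhost:8081/subjects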

3. Connector creation

To create connectors that extract/inject data between Kafka and external databases, we use curl commands against the Kafka Connect REST API endpoints:

source - Database to Kafka

curl -X POST http://localhost:8083/connectors \
    -H "Content-Type: application/json" \
    --data @connectors/postgresql-source.json 

The postgresql-source.json file defines the connection information and the elements to watch (the employees table in this case). All updated rows will be sent to the topic hrdata.public.employees.

{
    "name": "postgresql-source-connector",  
    "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector", 
      "database.hostname": "postgres-source", 
      "database.port": "5432", 
      "database.user": "user", 
      "database.password": "password", 
      "database.dbname" : "db", 
      "database.server.name": "hrdata", 
      "table.include.list": "public.employees",
      ...
    }
  }
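
Once the connector is created, its state can be checked through the standard Connect status endpoint; both the connector and its task should report RUNNING:

curl -s http://localhost:8083/connectors/postgresql-source-connector/status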

sink - Kafka to Database

curl -X POST http://localhost:8083/connectors \
    -H "Content-Type: application/json" \
    --data @connectors/postgresql-sink.json

As with the source, a configuration file exists for the sink side, postgresql-sink.json. If the target table does not exist, it will be created.

{
    "name": "postgresql-sink-connector",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "connection.url": "jdbc:postgresql://postgres-sink:5432/db?user=user&password=password",
        "topics": "hrdata.public.employees",
        "table.name.format": "employees",
        "insert.mode": "insert",
        "auto.create": true,
    }
}

See more endpoints on Connect REST Interface documentation
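
For example, to list the registered connectors or to remove one of them:

# list all connectors
curl -s http://localhost:8083/connectors

# delete a connector, e.g. the sink one
curl -X DELETE http://localhost:8083/connectors/postgresql-sink-connector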

4. Manipulate the data

Let's try it out! Update or insert rows in the public.employees table of the postgres-source database.
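
A quick way to do so is with psql inside the containers (the employees column names below are assumptions; adapt them to the demo schema):

# insert a row in the source database
docker exec -it postgres-source psql -U user -d db \
    -c "INSERT INTO employees (first_name, last_name) VALUES ('Ada', 'Lovelace');"

# once the connectors have propagated it, look it up in the sink database
docker exec -it postgres-sink psql -U user -d db \
    -c "SELECT * FROM employees;"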

See the different ways to explore the databases of this project

🙎‍♂️ Use cases

See a use case example

⚙️ Other commands

Docker

Teardown & uninstall

$ docker-compose --project-name kafka-connect-pgsql -f kafka-connect.yml stop
Stopping confluent-connect  ... done
Stopping confluent-registry ... done
Stopping kafka              ... done
Stopping postgres-source    ... done
Stopping zookeeper          ... done
Stopping postgres-sink      ... done
$ docker-compose --project-name kafka-connect-pgsql -f kafka-connect.yml rm
Going to remove postgres, connect, kafka, zookeeper
Are you sure? [yN] y
Removing confluent-connect  ... done
Removing confluent-registry ... done
Removing kafka              ... done
Removing postgres-source    ... done
Removing zookeeper          ... done
Removing postgres-sink      ... done

Kafka

List topics

docker run --net=host --rm \
    wurstmeister/kafka:2.11-2.0.0 sh opt/kafka_2.11-2.0.0/bin/kafka-topics.sh \
        --zookeeper localhost:2181 \
        --list

Read topic

docker run --net=host --rm \
    wurstmeister/kafka:2.11-2.0.0 sh opt/kafka_2.11-2.0.0/bin/kafka-console-consumer.sh \
        --bootstrap-server localhost:9092 \
        --topic hrdata.public.employees \
        --timeout-ms 3000 \
        --from-beginning

Read topic (Avro)

docker run --net=host --rm \
    confluentinc/cp-schema-registry:5.0.0 kafka-avro-console-consumer \
        --bootstrap-server localhost:9092 \
        --topic hrdata.public.employees \
        --timeout-ms 3000 \
        --from-beginning

🔗 Useful links

Resources

Katacoda scenario

Confluent

Debezium

Others
