software8899 / kafka-streams-machine-learning-examples

This project forked from kaiwaehner/kafka-streams-machine-learning-examples

This project contains examples which demonstrate how to deploy analytic models to mission-critical, scalable production environments leveraging Apache Kafka and its Streams API. Models are built with Python, H2O, TensorFlow, DeepLearning4J and other technologies.

License: Apache License 2.0

Java 100.00%

kafka-streams-machine-learning-examples's Introduction

Machine Learning + Kafka Streams Examples

This project contains examples which demonstrate how to deploy analytic models to mission-critical, scalable production environments leveraging Apache Kafka and its Streams API. The examples include analytic models built with TensorFlow, Keras, H2O, Python, DeepLearning4J and other technologies.

Installation and Usage

Java 8 and Maven 3 are required. Maven will download all required dependencies.

Just download the project and run 'mvn clean package'.

Every example includes an implementation and a unit test. The examples are deliberately simple and lightweight, and no further configuration is needed to build and run them. For this reason, the generated models are also included in the project, which increases its download size.

The unit tests use Kafka helper classes such as EmbeddedSingleNodeKafkaCluster in the package "com.github.megachucky.kafka.streams.machinelearning.test.utils". If you want to run the main class of an implementation, you need to start a Kafka cluster (with at least one ZooKeeper node and one Kafka broker running) and create the required topics.

Use Cases and Technologies

The following examples are already available including unit tests:

  • Deployment of an H2O GBM model to a Kafka Streams application for prediction of flight delays
  • Deployment of an H2O Deep Learning model to a Kafka Streams application for prediction of flight delays
  • Deployment of a pre-built TensorFlow CNN model for image recognition
  • Deployment of a DL4J model to predict the species of Iris flowers

More sophisticated use cases around Kafka Streams and other technologies will be added over time. Some ideas:

  • Image Recognition with H2O and TensorFlow (to show the difference of using H2O instead of using just low level TensorFlow APIs)
  • Anomaly Detection with Autoencoders leveraging DeepLearning4J.
  • Cross Selling and Customer Churn Detection using classical Machine Learning algorithms but also Deep Learning
  • Stateful Stream Processing to combine different model execution steps into a more powerful workflow instead of "just" running inference on single events (a good example might be a streaming process with sliding or session windows).
  • Keras to build different models with Python, TensorFlow, Theano and other Deep Learning frameworks under the hood + Kafka Streams as generic Machine Learning infrastructure to deploy, execute and monitor these different models.

Example 1 - Gradient Boosting with H2O.ai for Prediction of Flight Delays

Use Case

Gradient Boosting Method (GBM) to predict flight delays. An H2O-generated GBM Java model (POJO) is instantiated and used in a Kafka Streams application to do inference on new events.
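The per-event inference step can be sketched without any Kafka or H2O dependency. The following is a minimal illustration under stated assumptions, not the project's code: `FlightDelayModel`, `FlightDelayScorer`, the column names and their order, and the output strings are all hypothetical. In a real application the `score` method would be applied to each record of a stream (for example via the Kafka Streams `mapValues` operation), and the model passed in would wrap the generated H2O class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a generated model: maps one input row to a label. */
interface FlightDelayModel {
    String predict(Map<String, String> row);
}

/** Stateless scoring step: one comma-separated event in, one prediction out. */
final class FlightDelayScorer {
    // Assumed column order of the incoming event; adjust to the real schema.
    static final String[] COLUMNS = {
        "Year", "Month", "DayofMonth", "DayOfWeek",
        "CRSDepTime", "UniqueCarrier", "Origin", "Dest"
    };

    private final FlightDelayModel model;

    FlightDelayScorer(FlightDelayModel model) {
        this.model = model;
    }

    /** Parses the event into named columns and asks the model for a label. */
    String score(String csvEvent) {
        String[] values = csvEvent.split(",", -1);
        if (values.length != COLUMNS.length) {
            // Malformed events are reported instead of being passed to the model.
            return "invalid input";
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            row.put(COLUMNS[i], values[i].trim());
        }
        return "delayed=" + model.predict(row);
    }
}
```

Because the scorer keeps no state between events, the same instance can be reused for every record, and a different model can be plugged in without touching the parsing logic.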

Machine Learning Technology

  • H2O
  • Check the H2O demo to understand the test and how the model was built
  • You can re-use the generated Java model attached to this project (gbm_pojo_test.java) or build your own model using R, Python, Flow UI or any other technology supported by the H2O framework.

Source Code

MachineLearning_H2O_Example.java

Unit Test

MachineLearning_H2O_Example_IntegrationTest.java

The project includes another example with similar code that uses an H2O Deep Learning model instead of the H2O GBM model: Kafka_Streams_MachineLearning_H2O_DeepLearning_Example_IntegrationTest.java. This shows how you can easily test or replace different analytic models for one use case, or even use them for A/B testing.
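One way to picture the A/B testing idea is a router that sends each event to one of two models, chosen deterministically from the event key. This is a hedged sketch only: `AbModelRouter`, the percentage split, and the use of `String.hashCode` for bucketing are assumptions made for illustration and do not appear in the project.

```java
import java.util.function.Function;

/** Routes each event to one of two models, chosen deterministically by key. */
final class AbModelRouter {
    private final Function<String, String> modelA;
    private final Function<String, String> modelB;
    private final int percentToB; // share of key buckets sent to model B, 0..100

    AbModelRouter(Function<String, String> modelA,
                  Function<String, String> modelB,
                  int percentToB) {
        if (percentToB < 0 || percentToB > 100) {
            throw new IllegalArgumentException("percentToB must be between 0 and 100");
        }
        this.modelA = modelA;
        this.modelB = modelB;
        this.percentToB = percentToB;
    }

    /** Returns "A" or "B"; the same key always maps to the same variant. */
    String variantFor(String key) {
        // floorMod keeps the bucket in 0..99 even for negative hash codes.
        int bucket = Math.floorMod(key.hashCode(), 100);
        return bucket < percentToB ? "B" : "A";
    }

    /** Applies the selected model and tags the result with the variant name. */
    String predict(String key, String event) {
        String variant = variantFor(key);
        Function<String, String> model = variant.equals("B") ? modelB : modelA;
        return variant + ":" + model.apply(event);
    }
}
```

Tagging each result with the variant name makes it possible to compare the two models downstream; keying the split on the event key rather than on a random number keeps the assignment stable across restarts.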

Example 2 - Convolutional Neural Network (CNN) with TensorFlow for Image Recognition

Use Case

Convolutional Neural Network (CNN) for image recognition. A prebuilt TensorFlow CNN model is instantiated and used in a Kafka Streams application to recognize new JPEG images. A Kafka input topic receives the location of a new image (another option would be to send the image itself in the Kafka message instead of just a link to it). The application infers the content of the picture via the TensorFlow model and sends the result to a Kafka output topic.

Machine Learning Technology

  • TensorFlow
  • Leverages TensorFlow for Java. These APIs are particularly well-suited for loading models created in Python and executing them within a Java application. Please note: The Java API doesn't yet include convenience functions (which you might know from Keras), thus a private helper class is used in the example for construction and execution of the pre-built TensorFlow model.
  • Check the official TensorFlow demo LabelImage to understand this image recognition example
  • You can re-use the pre-trained TensorFlow model attached to this project (tensorflow_inception_graph.pb) or add your own model.
  • The 'images' folder contains images which were used for training the model (trained_airplane_1.jpg, trained_airplane_2.jpg, trained_butterfly.jpg) and also a new picture (new_airplane.jpg) which is not known by the model and has a different resolution than the others. Feel free to add your own pictures. Their content needs to be among the trained labels (see the list in the file imagenet_comp_graph_label_strings.txt), otherwise the model will return 'unknown'.
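The step that turns the model's output into a readable result can be sketched as follows. This is an illustration under assumptions, not the project's helper class: `LabelPicker`, its method names, and the result format are hypothetical. It assumes the model yields one probability per line of the labels file and that the highest probability determines the reported label.

```java
import java.util.List;
import java.util.Locale;

/** Picks the most probable label from a model's per-label probabilities. */
final class LabelPicker {
    /** Returns the index of the largest value; the first one wins on ties. */
    static int argMax(float[] probabilities) {
        if (probabilities.length == 0) {
            throw new IllegalArgumentException("probabilities must not be empty");
        }
        int best = 0;
        for (int i = 1; i < probabilities.length; i++) {
            if (probabilities[i] > probabilities[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Formats the best match, e.g. for sending to an output topic. */
    static String describe(List<String> labels, float[] probabilities) {
        if (labels.size() != probabilities.length) {
            throw new IllegalArgumentException("one probability per label is required");
        }
        int best = argMax(probabilities);
        // Locale.ROOT keeps the decimal separator stable across machines.
        return String.format(Locale.ROOT, "%s (%.2f%% likely)",
                labels.get(best), probabilities[best] * 100f);
    }
}
```

In this sketch the label list would be read once from the labels file, one entry per line, and reused for every image.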

Source Code

Kafka_Streams_TensorFlow_Image_Recognition_Example.java

Unit Test

Kafka_Streams_TensorFlow_Image_Recognition_Example_IntegrationTest.java

Example 3 - Iris Prediction using a Neural Network with DeepLearning4J (DL4J)

Use Case

Iris species prediction using a neural network. This is a famous example that has been implemented with many different ML algorithms. Here I use DeepLearning4J (DL4J) to build a neural network using the Iris dataset.

Machine Learning Technology

Unit Test

Kafka_Streams_MachineLearning_DL4J_DeepLearning_Iris_IntegrationTest.java

Contributors

kaiwaehner
