Pet project to experiment with Kafka, Flink & Cassandra.
To stay in control of the incoming data, I've set up a websocket server that reads from a huge CSV file and posts each line to its clients at regular intervals.
Run:
sbt "project websocket-server" "run"
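The pacing idea behind the websocket server can be sketched in a few lines of plain Scala. This is only an illustration, not the project's actual code: `PacedCsvReplay`, `replay` and the `send` callback are hypothetical names, and `send` stands in for whatever pushes a message to the connected websocket clients. The file is read lazily, so a huge CSV never has to fit in memory.

```scala
import scala.io.Source

// Hypothetical sketch: replay lines one by one, pausing between them.
object PacedCsvReplay {

  /** Calls `send` for each line, then sleeps `intervalMillis` before the next one. */
  def replay(lines: Iterator[String], intervalMillis: Long)(send: String => Unit): Unit =
    lines.foreach { line =>
      send(line)
      Thread.sleep(intervalMillis)
    }

  def main(args: Array[String]): Unit = {
    // Path to the CSV file is passed as the first argument (assumption).
    val source = Source.fromFile(args(0))
    try {
      // getLines() is a lazy iterator, so the file is streamed, not loaded whole.
      // println stands in for "post to all websocket clients".
      replay(source.getLines(), 1000L)(println)
    } finally {
      source.close()
    }
  }
}
```

A blocking `Thread.sleep` keeps the sketch short; a real server would more likely use a scheduler or the timer facilities of its websocket library.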
Start Kafka on Docker with ZooKeeper:
docker run -d -p 2181:2181 -p 9092:9092 --env ADVERTISED_HOST=127.0.0.1 --env ADVERTISED_PORT=9092 --name kafka johnnypark/kafka-zookeeper
ADVERTISED_HOST is set to 127.0.0.1 here. Setting ADVERTISED_HOST to localhost, 127.0.0.1, or 0.0.0.0 works only if producers and consumers are started within the kafka container itself, or if you are using Docker for Mac (like me) and want to run producers and consumers from OS X. For other containers to be able to run producers and consumers, ADVERTISED_HOST has to be an address those containers can reach.
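On the client side, the address a producer connects to has to match the advertised host and port. Here is a minimal sketch of matching producer settings, assuming plain string keys and values; `ProducerSettings` and `forBroker` are hypothetical names, while the property keys and the serializer class are the standard Kafka client ones.

```scala
import java.util.Properties

// Hypothetical helper: builds the settings a Kafka producer would need
// to reach the broker advertised by the container.
object ProducerSettings {

  def forBroker(host: String, port: Int): Properties = {
    val props = new Properties()
    // Must be reachable from where the producer runs, see ADVERTISED_HOST.
    props.setProperty("bootstrap.servers", s"$host:$port")
    // Assumption: each CSV line is sent as a plain string.
    props.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props
  }
}
```

With the kafka-clients library on the classpath, `ProducerSettings.forBroker("127.0.0.1", 9092)` could then be passed to `new KafkaProducer[String, String](props)`.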
Once the websocket server & Kafka are running, we can start the websocket client / Kafka producer:
sbt "project websocket-client-kafka" "run"
Start Cassandra on Docker:
docker run --name cassandra -d cassandra
Have a Kafka consumer in Flink read from the Kafka stream and:
* process the data
* store the results in Cassandra