Please run all of these commands from inside the foobar-kafka directory.
#Create docker networks
$ docker network create kafka-network # create a new docker network for kafka cluster (zookeeper, broker, kafka-manager services, and kafka connect sink services)
$ docker network create cassandra-network # create a new docker network for cassandra. (kafka connect will exist on this network as well in addition to kafka-network)
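`docker network create` exits with an error when the network already exists, which can interrupt a repeated setup. The following is a minimal sketch of an idempotent variant; `ensure_network` is a hypothetical helper name and is not part of this repository.

```shell
# Create a docker network only if it does not already exist.
# ensure_network is a hypothetical helper, not part of this repository.
ensure_network() {
  if docker network inspect "$1" >/dev/null 2>&1; then
    echo "network $1 already exists"
  else
    docker network create "$1"
  fi
}

# Example:
#   ensure_network kafka-network
#   ensure_network cassandra-network
```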
Cassandra is set up to run its keyspace and schema creation scripts on first startup, so it is ready to use.
$ docker-compose -f cassandra/docker-compose.yml up -d
$ docker-compose -f kafka/docker-compose.yml up -d # start single zookeeper, broker, kafka-manager and kafka-connect services
$ docker ps -a # sanity check to make sure services are up: kafka_broker_1, kafka-manager, zookeeper, kafka-connect service
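The sanity check above can also be scripted rather than read by eye. This is a sketch only: `check_services` is a hypothetical helper, and the service names passed to it are assumptions taken from the comment above, matched as substrings of the running container names.

```shell
# Report which of the expected services have a running container.
# check_services is a hypothetical helper; it matches each name as a
# substring of the container names printed by `docker ps`.
check_services() {
  running=$(docker ps --format '{{.Names}}')
  missing=0
  for svc in "$@"; do
    case "$running" in
      *"$svc"*) echo "up: $svc" ;;
      *) echo "MISSING: $svc"; missing=1 ;;
    esac
  done
  return "$missing"
}

# Example:
#   check_services zookeeper broker kafka-manager kafka-connect
```

The function returns a non-zero status when any name is missing, so it can gate later steps in a script.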
Note: Kafka front end is available at http://localhost:9000
Kafka-Connect REST interface is available at http://localhost:8083
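Both endpoints can be probed from the command line before starting the producers. The sketch below assumes `curl` is installed on the host; `http_status` is a hypothetical helper, and `/connectors` is the Kafka Connect REST path that lists registered connectors.

```shell
# Print the HTTP status code returned by a URL.
# http_status is a hypothetical helper built on curl.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Example:
#   http_status http://localhost:9000              # Kafka front end
#   http_status http://localhost:8083/connectors   # Kafka Connect REST interface
#   curl -s http://localhost:8083/connectors       # list registered connectors
```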
$ docker-compose -f owm-producer/docker-compose.yml up -d # start the producer that retrieves OpenWeatherMap data
$ docker-compose -f twitter-producer/docker-compose.yml up -d # start the producer for twitter
(Optionally, build the images first. Docker builds them automatically if they have not been built before, but an explicit build is required to apply new changes.)
$ docker-compose -f consumers/docker-compose.yml build
Start consumers:
$ docker-compose -f consumers/docker-compose.yml up -d # start the consumers
$ docker ps -a # sanity check to make sure all services, including the producers and consumers, are up
To stop all running Kafka cluster services:
$ docker-compose -f consumers/docker-compose.yml down # stop the consumers
$ docker-compose -f owm-producer/docker-compose.yml down # stop open weather map producer
$ docker-compose -f twitter-producer/docker-compose.yml down # stop twitter producer
$ docker-compose -f kafka/docker-compose.yml down # stop zookeeper, broker, kafka-manager and kafka-connect services
$ docker-compose -f cassandra/docker-compose.yml down # stop Cassandra
To remove the kafka-network and cassandra-network networks:
$ docker network rm kafka-network
$ docker network rm cassandra-network
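The shutdown and cleanup commands above can be wrapped in a single function. This is a sketch under assumptions: `teardown` and the `COMPOSE` variable are hypothetical names introduced here, and the function must be run from the foobar-kafka directory.

```shell
# Stop every service in the same order as the commands above,
# then remove both networks.
# teardown and COMPOSE are hypothetical names, not part of this repository.
# Set COMPOSE="docker compose" if you use the Compose V2 plugin.
teardown() {
  for dir in consumers owm-producer twitter-producer kafka cassandra; do
    ${COMPOSE:-docker-compose} -f "$dir/docker-compose.yml" down
  done
  docker network rm kafka-network cassandra-network
}

# Example:
#   teardown
```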
First, log in to Cassandra's container with the following command, or open a new CLI from Docker Desktop if you use it.
$ docker exec -it cassandra bash
Once logged in, start cqlsh with this command and query the twitterdata and weatherreport tables like this:
$ cqlsh --cqlversion=3.4.4 127.0.0.1 #make sure you use the correct cqlversion
cqlsh> use kafkapipeline; #keyspace name
cqlsh:kafkapipeline> select * from twitterdata;
cqlsh:kafkapipeline> select * from weatherreport;
And that's it! You should see records coming into Cassandra. Feel free to experiment by bringing containers down and back up again to see the magic of fault tolerance!
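To confirm that records keep arriving, the same tables can be queried without opening an interactive session. The sketch below relies on the `-e` option of cqlsh, which executes one statement and exits; `count_rows` is a hypothetical helper, and the container name, keyspace, and cqlversion are the ones used in the commands above.

```shell
# Count the rows in a kafkapipeline table without an interactive session.
# count_rows is a hypothetical helper, not part of this repository.
count_rows() {
  docker exec cassandra cqlsh --cqlversion=3.4.4 \
    -e "SELECT COUNT(*) FROM kafkapipeline.$1;"
}

# Example:
#   count_rows twitterdata
#   count_rows weatherreport
```

Running it twice a few seconds apart should show the count growing while the producers and consumers are up.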
To load backup CSV files into Cassandra, from a console run the following:
$ python consumers/python/cassandrautils.py twitter {PATH_TO_twitter.csv}
$ python consumers/python/cassandrautils.py weather {PATH_TO_weather.csv}
How can I connect to a running container?
$ docker exec -it <container_name> bash # docker exec needs a command to run; use sh if the image does not include bash
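Not every image ships with bash, so a small wrapper can pick whichever shell is available. This is a sketch only; `container_shell` is a hypothetical helper and assumes the container provides at least `sh`.

```shell
# Open a shell in a running container, preferring bash and falling back to sh.
# container_shell is a hypothetical helper, not part of this repository.
container_shell() {
  if docker exec "$1" sh -c 'command -v bash' >/dev/null 2>&1; then
    docker exec -it "$1" bash
  else
    docker exec -it "$1" sh
  fi
}

# Example:
#   docker ps --format '{{.Names}}'   # find the container name
#   container_shell cassandra
```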