See thesis_spark_vs_flink.pdf
- Open a shell in the JobManager container:
docker exec -it jobmanager /bin/sh
- Start the WordStream creator to send words and the Flink app to consume them (the cluster must be running first):
python3 /Development/WordStreamCreator/word_stream_creator.py
./bin/flink run -c WordCount my_apps/flink-word-count_2.12-1.0.jar --9001
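The word_stream_creator.py itself is not reproduced here; a minimal Python sketch of what such a source could look like is below. The vocabulary, host, delay, and single-client socket-server behavior on port 9001 are assumptions, not the actual script:

```python
import random
import socket
import time
from typing import Iterator

# Placeholder vocabulary; the real script may read words from a file.
WORDS = ["spark", "flink", "stream", "batch", "word", "count"]

def word_stream(rng: random.Random) -> Iterator[str]:
    """Endless stream of newline-terminated random words."""
    while True:
        yield rng.choice(WORDS) + "\n"

def serve_words(host: str = "0.0.0.0", port: int = 9001, delay: float = 0.1) -> None:
    """Accept one client (e.g. the Flink socket source) and push words to it."""
    with socket.create_server((host, port)) as server:
        conn, _ = server.accept()
        with conn:
            for line in word_stream(random.Random()):
                conn.sendall(line.encode())
                time.sleep(delay)

if __name__ == "__main__":
    serve_words()
```

The Flink job then reads the stream with its socket source pointed at this host and port.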
- Start the batch application to run WordCountBatch on the Flink cluster:
./bin/flink run -c WordCountBatch my_apps/flink-word-count_2.12-1.0.jar --stdout --input input_data/50MiB
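The logic of the batch job, sketched in plain Python rather than the Flink API (the lower-casing and tokenization rule are assumptions about what WordCountBatch does):

```python
import re
from collections import Counter

def word_count(text: str) -> Counter:
    """Lower-case, tokenize on non-alphanumeric characters, and count --
    the same map/reduce shape a word-count batch job implements."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(words)

# Example over an in-memory string instead of input_data/50MiB:
counts = word_count("To be or not to be")
# counts["to"] == 2, counts["be"] == 2
```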
- Run batch processing for multiple experiments in the jobmanager container:
sh start.sh
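start.sh itself is not shown; a hypothetical Python equivalent that submits the batch job once per dataset size could look like this (the sizes, paths, and print-only dry run are assumptions -- swap the print for subprocess.run to actually submit):

```python
# Hypothetical driver for multiple batch experiments; dataset sizes
# and jar path are placeholders, not taken from the real start.sh.
SIZES = ["50MiB", "100MiB", "200MiB"]

def submit_commands(sizes):
    """Build one 'flink run' command per dataset size."""
    return [
        ["./bin/flink", "run", "-c", "WordCountBatch",
         "my_apps/flink-word-count_2.12-1.0.jar",
         "--stdout", "--input", f"input_data/{size}"]
        for size in sizes
    ]

for cmd in submit_commands(SIZES):
    print(" ".join(cmd))  # dry run; use subprocess.run(cmd) to submit for real
```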
- View the result output of the app:
tail log/flink-*-taskexecutor-*.out
- Go to folder: Environments/FlinkEnv/bin
sh jobmanager.sh start-foreground
- Go to folder: Environments/FlinkEnv/
docker-compose up
- Spark web UI (port 4040, available while an application is running):
http://192.168.178.56:4040/jobs
- Open a shell in the Spark master container:
docker exec -it spark-master bash
- Test applications on Spark; replace the master URL and the number of tasks as needed:
spark/bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://$(hostname):7077 --num-executors 1 --driver-memory 1g --executor-memory 1g --executor-cores 1 spark/examples/jars/spark-examples*.jar 100
spark/bin/spark-submit --class WordCount --master spark://$(hostname):7077 --num-executors 1 /apps/spark-word-count_2.12-1.0.jar 1000 6
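The SparkPi example submitted above estimates π by Monte Carlo sampling. The same computation, single-threaded in plain Python as an illustration of what the job does (not the actual Spark code, which spreads the samples across tasks):

```python
import random

def estimate_pi(samples: int, rng: random.Random) -> float:
    """Draw points uniformly in the unit square and count how many fall
    inside the quarter circle; that fraction approaches pi/4."""
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples

print(estimate_pi(100_000, random.Random(42)))  # roughly 3.14
```

The final spark-submit argument (100) plays the role of the sample/partition multiplier in the distributed version.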
- Build the Spark image (from the folder containing the Dockerfile):
sudo docker build -t pipi/spark:latest .
- Go to folder: Environments/SparkEnv/
docker-compose up spark-master
docker-compose up spark-worker-surface