This code is discussed at length at machinecreek.com.
This repository contains multiple examples of Spark concepts. Each example is intended to be self-contained, so that it can be copied into an empty directory, then opened, compiled, and run in IntelliJ.
The following must be installed on the development machine:
- IntelliJ: used in all tutorials. Other IDEs are possible but not tested.
- Maven: 3.3.9 is used here. Earlier versions are possible but not tested.
- Scala IntelliJ Plugin 3.0.6 is used here.
- Spark 2.1.1+
- Scala 2.11.7
- Hadoop/Yarn 2.7.1
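
Once the prerequisites are installed, a small local job is one way to check that the toolchain works end to end. The following is a minimal sketch, not code from this repository: the object name `SmokeTest`, the helper `sumTo`, and the app name are hypothetical, and it assumes the project's Maven build puts the Spark SQL artifact for Scala 2.11 on the classpath. It runs with a local master, so it does not exercise Hadoop/Yarn.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical smoke test: names here are illustrative, not part of this repository.
object SmokeTest {

  // Sums the integers 1..n as a Spark job on a local master.
  def sumTo(n: Int): Double = {
    val spark = SparkSession.builder()
      .appName("smoke-test")   // assumed app name
      .master("local[*]")      // run in-process, using all local cores
      .getOrCreate()
    try {
      // Distribute the range as an RDD and add it up on the executors.
      spark.sparkContext.parallelize(1 to n).sum()
    } finally {
      spark.stop()             // always release the local Spark context
    }
  }

  def main(args: Array[String]): Unit =
    println(s"Sum of 1 to 100: ${sumTo(100)}")
}
```

If this compiles and runs from IntelliJ, the Scala plugin, Maven dependency resolution, and the Spark libraries are likely set up correctly.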