This repository contains demo code for integrating Algorithmia with Apache Spark.
The demo feeds the contents of a file, line by line, into a given Algorithmia algorithm, then groups and counts the algorithm's output using Apache Spark.
- Have Spark installed either locally or on a cluster (see: http://spark.apache.org/)
- Edit src/main/scala/algorithmia/spark/Main.scala to set:
  - ALGORITHMIA_API_KEY : Your Algorithmia API key
  - ALGORITHMIA_ALGO_NAME : Name of the algorithm you'd like to call
  - SPARK_HOSTNAME : Connection string for your Spark instance
  - INPUT_FILE_NAME : Name of the file to use as input
  - NUM_PARTITIONS (optional) : Number of partitions Spark should use when reading the file
- Run `sbt run` from the top level of the project
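The core group-and-count step can be sketched in plain Scala. This is a minimal, hedged illustration, not the repo's actual Main.scala: `callAlgorithm` is a hypothetical stand-in for the Algorithmia API call (here it just lowercases each line), and the `groupBy`/`size` pair plays the role that a pair-RDD `reduceByKey(_ + _)` would play in the real Spark job.

```scala
object GroupAndCountDemo {
  // Hypothetical stand-in for calling the Algorithmia algorithm on one input line.
  // The real demo would send the line to ALGORITHMIA_ALGO_NAME via the Algorithmia client.
  def callAlgorithm(line: String): String = line.toLowerCase

  // Feed each line through the algorithm, then group identical outputs and count them.
  // In Spark this would be: rdd.map(callAlgorithm).map(w => (w, 1)).reduceByKey(_ + _)
  def groupAndCount(lines: Seq[String]): Map[String, Int] =
    lines.map(callAlgorithm).groupBy(identity).map { case (k, v) => (k, v.size) }

  def main(args: Array[String]): Unit = {
    val counts = groupAndCount(Seq("Foo", "bar", "foo"))
    println(counts) // Map(foo -> 2, bar -> 1)
  }
}
```

The collection version runs without a cluster, which makes it handy for sanity-checking the counting logic before pointing the job at SPARK_HOSTNAME.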