Giter VIP home page Giter VIP logo

spark-scala-word-count's Introduction

SPARK-SCALA-WORD-COUNT

A simplist demo on how to write, compile, export, and run a spark word count job via spark scala with sbt tool

Quick Start

# STEP 0) 
$ git clone https://github.com/yennanliu/spark-scala-word-count.git && cd spark-scala-word-count 

# STEP 1) download the used dependencies.
$ sbt clean compile

# STEP 2) run spark word count via `sbt run`
$ sbt run

# STEP 3) create jars from spark scala scriots 
$ sbt assembly

# STEP 4) run spark word count via `spark submit`
$ spark-submit /Users/$USER/spark-scala-word-count/target/scala-2.11/spark-scala-word-count-assembly-1.0.jar

Quick Start (Docker)

# STEP 0) 
$ git clone https://github.com/yennanliu/spark-scala-word-count.git

# STEP 1) 
$ cd spark-scala-word-count

# STEP 2) docker build 
$ docker build . -t spark_env

# STEP 3) ONE COMMAND : run the docker env and sbt compile and sbt run and assembly once 
$ docker run  --mount \
type=bind,\
source="$(pwd)"/.,\
target=/spark-word-count \
-i -t spark_env \
/bin/bash  -c "cd ../spark-word-count && sbt clean compile && sbt run && sbt assembly && spark-submit /spark-word-count/target/scala-2.11/spark-scala-word-count-assembly-1.0.jar"

# STEP 3') : STEP BY STEP : access docker -> sbt clean compile -> sbt run -> sbt assembly -> spark-submit 
# docker run 
$ docker run  --mount \
type=bind,\
source="$(pwd)"/.,\
target=/spark-word-count \
-i -t spark_env \
/bin/bash 
# inside docker bash 
root@942744030b57:~ cd ../spark-word-count && sbt clean compile && sbt assembly
root@942744030b57:~ spark-submit /spark-word-count/target/scala-2.11/spark-scala-word-count-assembly-1.0.jar

Reference

Todo

  • Auto commit built jar to S3/github/slack
  • Auto run spark jar at cloud

spark-scala-word-count's People

Contributors

yennanliu avatar

Stargazers

 avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.