sparkscalalearning's Introduction

Windows 10 Setup

Install JDK (Java Development Kit)

  • Download Java SE Development Kit 8u102

Install Spark

  • Download Spark 2.3.3 (.tgz) from spark.apache.org
  • Extract (for example with WinRAR x64) and copy to C:\spark_2.3.3
  • Modify conf\log4j.properties.template
    • Rename to log4j.properties
    • Change log4j.rootCategory=INFO, console to log4j.rootCategory=ERROR, console

winutils

  • sundog-spark.s3.amazonaws.com/winutils.exe
    • Copy to C:\winutils\bin

Setup Environment Variables

  • User variables
    • SPARK_HOME: C:\spark_2.3.3
    • JAVA_HOME: C:\Program Files\Java\jdk1.8.0_102
    • HADOOP_HOME: C:\winutils
    • Path
      • Edit
        • New: %SPARK_HOME%\bin
        • New: %JAVA_HOME%\bin
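
The variables above can be sanity-checked before launching Spark. The following is a minimal sketch in plain Scala (no Spark needed); the object name EnvCheck and its check helper are hypothetical, introduced only for illustration. It reports whether each variable is set and points at an existing folder.

```scala
import java.nio.file.{Files, Paths}

// Hypothetical helper, not part of Spark or of the course code.
object EnvCheck {

  // Returns None when the variable looks fine, or Some(description of the problem).
  def check(name: String, env: Map[String, String]): Option[String] =
    env.get(name) match {
      case None =>
        Some(s"$name is not set")
      case Some(dir) if !Files.isDirectory(Paths.get(dir)) =>
        Some(s"$name points to a missing folder: $dir")
      case _ =>
        None
    }

  def main(args: Array[String]): Unit = {
    // The three variables defined in the list above.
    val names = Seq("SPARK_HOME", "JAVA_HOME", "HADOOP_HOME")
    names.foreach { n =>
      println(check(n, sys.env).getOrElse(s"$n looks OK"))
    }
  }
}
```

Open a new Command Window before running it, since windows that were already open do not pick up newly added variables.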

Scala IDE

  • scala-ide.org
    • Eclipse

Local Test

  • Command Window
    • cd C:\spark_2.3.3
      • spark-shell
      • val rdd=sc.textFile("README.md")
      • rdd.count()

Scala Project

Get Movie Data

  • grouplens.org
    • dataset
      • ml-100k.zip (MovieLens 100K Dataset)
      • u.data contains the full set of ratings
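
Each row of u.data is tab-separated: user id, item id, rating, timestamp. As a rough illustration of that layout, here is a small plain-Scala sketch that parses one row; the names Rating and RatingParser are hypothetical and do not come from the course code.

```scala
// One row of u.data: user id, item id, rating, Unix timestamp.
final case class Rating(userId: Int, itemId: Int, rating: Int, timestamp: Long)

// Hypothetical helper for illustration only.
object RatingParser {

  // Returns None for a line that does not hold four numeric, tab-separated fields.
  def parse(line: String): Option[Rating] =
    line.split("\t") match {
      case Array(u, i, r, t) =>
        try Some(Rating(u.toInt, i.toInt, r.toInt, t.toLong))
        catch { case _: NumberFormatException => None }
      case _ =>
        None
    }
}
```

For example, RatingParser.parse("196\t242\t3\t881250949") yields Some(Rating(196,242,3,881250949)), while a malformed line yields None.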

Get Code Example

  • media.sundog-soft.com/SparkScala/SparkScala.zip

Eclipse

  • New Scala Project
    • New (Java) Package
      • Name: com.sundogsoftware.spark
    • Import (src code)
      • General
        • File System
          • SparkScala folder extracted from the zip above
            • RatingsCounter.scala
    • Project Properties
      • Java Build Path
        • Libraries
          • Add External JARs
            • All JARs in C:\spark_2.3.3\jars
    • Fix the Scala version incompatibility (Spark 2.3.3 is built against Scala 2.11, so use 2.11.11)
      • Project Properties
        • Scala Compiler
          • Use Project Settings
          • Fixed Scala Installation: 2.11.11 (built-in)
    • Run
      • Run Configurations
        • Scala Application
          • Name: RatingsCounter
          • Main class: package name + Scala object name, i.e. com.sundogsoftware.spark.RatingsCounter
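
To give a feel for what such a job does, here is a minimal sketch of a ratings counter. It is not the RatingsCounter.scala file from the zip: the object name RatingsCounterSketch, the extractRating helper, and the default data path ml-100k/u.data are assumptions made for illustration. It needs the Spark 2.3.3 JARs on the build path, as set up above.

```scala
package com.sundogsoftware.spark

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkContext

// Hypothetical name; the course's own file is RatingsCounter.scala.
object RatingsCounterSketch {

  // u.data is tab-separated: user id, item id, rating, timestamp.
  // The rating is the third field.
  def extractRating(line: String): String = line.split("\t")(2)

  def main(args: Array[String]): Unit = {
    // Keep Spark's own logging down to errors.
    Logger.getLogger("org").setLevel(Level.ERROR)

    // Assumed location of the data file; pass another path as the first argument.
    val path = args.headOption.getOrElse("ml-100k/u.data")

    // Run locally, using every core on this machine.
    val sc = new SparkContext("local[*]", "RatingsCounterSketch")

    // One rating string per input line.
    val ratings = sc.textFile(path).map(extractRating)

    // countByValue is an action: it returns a map of value -> occurrences to the driver.
    val counts = ratings.countByValue()

    // Print the counts ordered by rating value.
    counts.toSeq.sortBy(_._1).foreach { case (rating, n) => println(s"$rating: $n") }

    sc.stop()
  }
}
```

With this sketch the Main class in the run configuration would be com.sundogsoftware.spark.RatingsCounterSketch, following the same package name + object name rule.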

Using Cluster Manager

Eclipse

  • Right-click the package
    • Export
      • Java > JAR file
        • Save to any location

Command Window

  • Go to the folder where you saved the JAR file
    • spark-submit --class com.sundogsoftware.spark.RatingsCounter RatingsCounter.jar
