net.jgp.books.spark.ch99's Introduction

The examples in this repository support the book Spark in Action, 2nd edition, by Jean-Georges Perrin, published by Manning. Find out more about the book on Manning's website.

Spark in Action, 2nd edition - chapter 99

Welcome to Spark in Action, 2nd edition, chapter 99. This chapter collects the material we would have loved to include in the book but could not, because the book already runs to more than 600 pages.

This code is designed to work with Apache Spark v3.0.0.

Data quality labs

Data quality labs are located in the dq subpackage.

Lab #200

This lab mixes machine learning and data quality to predict the revenues of a party of 40 people at a restaurant.
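As a rough illustration of the idea behind this lab, the sketch below first applies a data quality rule and then fits a model to predict revenue for a party of 40. It uses plain Java rather than Spark so that it runs without dependencies. The README does not say which model or which quality rules the lab uses, so the ordinary least squares fit, the "positive guest count, non-negative revenue" rule, the class name RevenuePredictionSketch, and the sample figures are all assumptions made up for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch: not the lab's actual code.
final class RevenuePredictionSketch {

  /** Data quality step (assumed rule): keep rows with guests > 0 and revenue >= 0. */
  static double[][] clean(double[][] rows) {
    return Arrays.stream(rows)
        .filter(r -> r[0] > 0 && r[1] >= 0)
        .toArray(double[][]::new);
  }

  /** Fits revenue = slope * guests + intercept by ordinary least squares. */
  static double[] fit(double[][] rows) {
    int n = rows.length;
    double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
    for (double[] r : rows) {
      sumX += r[0];
      sumY += r[1];
      sumXY += r[0] * r[1];
      sumXX += r[0] * r[0];
    }
    double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    double intercept = (sumY - slope * sumX) / n;
    return new double[] {slope, intercept};
  }

  /** Applies the fitted line to a party size. */
  static double predict(double[] model, double guests) {
    return model[0] * guests + model[1];
  }

  public static void main(String[] args) {
    // Made-up sample rows: {guests, revenue}. Two rows are deliberately invalid.
    double[][] raw = {{10, 250}, {20, 500}, {-5, 100}, {30, 750}, {15, -1}};
    double[] model = fit(clean(raw));
    System.out.println("Predicted revenue for 40 guests: " + predict(model, 40));
  }
}
```

Cleaning before fitting matters here: the two invalid rows would otherwise pull the fitted line away from the valid observations.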

Covid19 labs

The Covid19 labs are located in the covid19 package.

Data

The data ingested by these labs comes from the Center for Systems Science and Engineering (CSSE), part of the Whiting School of Engineering at Johns Hopkins University (JHU). The data is shared on GitHub at https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data.

Lab #100 Ingestion

Simple data ingestion.
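Before a file can be ingested, it has to be located and fetched. The following is a minimal sketch in plain Java of how one daily report from the CSSE repository might be located and downloaded. It assumes the daily reports live in a csse_covid_19_daily_reports directory and are named MM-dd-yyyy.csv, and that they are served through raw.githubusercontent.com; none of this is stated in this README. The class name CsseDailyReportLocator and its methods are hypothetical and are not taken from the lab.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: not part of the repository.
final class CsseDailyReportLocator {

  // Assumed location of the daily report files in the CSSE repository.
  private static final String BASE =
      "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/"
          + "csse_covid_19_data/csse_covid_19_daily_reports/";

  // Assumed file naming convention: month-day-year, for example 03-15-2020.csv.
  private static final DateTimeFormatter FILE_DATE =
      DateTimeFormatter.ofPattern("MM-dd-yyyy");

  /** Builds the URL of the daily report for the given day. */
  static String urlFor(LocalDate day) {
    return BASE + FILE_DATE.format(day) + ".csv";
  }

  /** Downloads the daily report for the given day into the target file. */
  static void download(LocalDate day, Path target) throws IOException {
    try (InputStream in = URI.create(urlFor(day)).toURL().openStream()) {
      Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
  }

  public static void main(String[] args) {
    // Only prints the URL; call download(...) to fetch the file.
    System.out.println(urlFor(LocalDate.of(2020, 3, 15)));
  }
}
```

Once a file has been downloaded locally, it can be read with Spark's CSV reader like any other CSV file.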

Other stuff

Located in the misc package.

Lab #9xx

A bunch of work in progress; please ignore.

Data

This repository contains many datasets, which will be cleaned up soon.

Notes:

  1. This repository only contains Java examples.

Follow me on Twitter to get updates about the book and Apache Spark: @jgperrin. Join the book's community on Facebook or on Manning's live site.

