deploydoc's Introduction

A brief analysis of Spark, with some notes from day-to-day work

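On a whim, I decided to analyze Spark in some depth without losing touch with hands-on practice, so wherever possible the analyses here come with a simple demo. The Spark source code used is version 1.2.0.

### Brief introduction

As a summary of my work, this repository records how to do Spark- and Hadoop-related work essentially from scratch. It starts with the very basics: how to build and install Spark and how to submit a job to Spark. It then analyzes some of Spark's important features, quoting good Spark-related material where useful (quoted material is marked as such; if anything infringes your rights, please contact me and I will correct it promptly). The analysis includes many example programs. Many are the examples bundled with Spark and others are good examples found online, and I annotate every program in detail. I also draw on my actual work and analyze in detail the real problems I have run into.

### Main contents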

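1.Build and install Spark - building and installing Spark

2.RDD details - a detailed look at how RDDs are used and what an RDD really is

3.Job executing and task scheduling - how Spark schedules and executes tasks internally

4.Deploy - an analysis of the Spark Deploy module

5.SparkStreaming - a brief source-code analysis of the Spark Streaming module

6.Custom receiver - how to write a custom Spark Streaming receiver

7.Custom FileInputFormat - how to implement your own InputFormat

8.[Shuffle] - a study of Spark shuffle, compared with Hadoop

9.[Spark fault tolerant] - a study of Spark's lineage-based fault tolerance, compared with Hadoop

10.[Spark-sql] - an introduction to Spark SQL

11.[Spark-Mllib] - an introduction to Spark MLlib

12.[Spark-Graphx] - an introduction to Spark GraphX

13.[Tachyon] - an introduction to Tachyon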

### Other issues

Various other problems come up in real work. The important ones are also recorded here as documents.

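1.receiver - an analysis of the Spark source code, and a modification of the Spark Streaming module so that it supports adding and stopping streams dynamically

2.intellij - some issues with packaging in IntelliJ

3.exception - a summary of problems encountered in practice

4.kafka installation and configuration - installing and configuring ZooKeeper and Kafka

5.opencv installation and configuration - installing and configuring OpenCV

6.ganglia installation and configuration - installing and configuring Ganglia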

deploydoc's People

Contributors

gjhkael
