Giter VIP home page Giter VIP logo

starterhadoopmapreduce's Introduction

starterHadoopMapReduce

Walking through basics of how to program MapReduce in Hadoop 2.x

Steps:

0 - Get your Apache Hadoop 2.x running on your local machine, preferably 2.3.0. 1 - Configure your environment variables; for example, for a Mac OS X host,

export MAVEN_OPTS="-Xms1024m -Xmx4096m -Djava.awt.headless=true" export HADOOP_VERSION=2.3.0 export HADOOP_PREFIX=/opt/hadoop-${HADOOP_VERSION} export HADOOP_HOME=${HADOOP_PREFIX} export HADOOP_MAPRED_HOME=${HADOOP_PREFIX} export HADOOP_COMMON_HOME=${HADOOP_PREFIX} export HADOOP_HDFS_HOME=${HADOOP_PREFIX} export YARN_HOME=${HADOOP_PREFIX} export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop export HADOOP_OPTS="-server -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk -Djava.security.krb5.conf=/dev/null -Dlog4j.configuration=file:${HADOOP_PREFIX}/etc/hadoop/log4j.properties -Djava.library.path=${HADOOP_PREFIX}/lib" export CLASSPATH=.:${HADOOP_PREFIX}/etc/hadoop:$(find "${HADOOP_PREFIX}" -maxdepth 1 -name '.jar' |xargs echo |tr ' ' ':'):$(find "${HADOOP_PREFIX}/lib" -maxdepth 1 -name '.jar' |xargs echo |tr ' ' ':') export PATH=${HADOOP_PREFIX}/sbin:${HADOOP_PREFIX}/bin:${PATH}

2 - Download several books from http://www.gutenberg.org; for example,

http://www.gutenberg.org/ebooks/39064 http://www.gutenberg.org/ebooks/3207

and store those files in your HDFS, e.g., in /tmp/books by uploading using the following command:

hadoop fs -copyFromLocal /dir/book-1.txt /tmp/books hadoop fs -copyFromLocal /dir/book-2.txt /tmp/books

3 - cd into starterHadoopMapReduce 4 - vi pom.xml

hadoop fs -cat /tmp/rmarano/20140405-2258/part-r-00000

CONGRATULATIONS! You have run your first map-reduce program on Hadoop.

/rob

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.