Hadoop-on-Docker

This repo shows how to install Hadoop in a Docker container.

Prerequisites:

-> Git
-> Docker
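
A quick way to confirm both prerequisites are installed is to look them up on the PATH. This is a minimal sketch; the helper name missing_tools is hypothetical and not part of the repo.

```shell
# Hypothetical helper: print each given tool that is NOT found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the two prerequisites listed above.
missing=$(missing_tools git docker)
if [ -n "$missing" ]; then
  echo "Missing prerequisites: $missing"
else
  echo "All prerequisites found"
fi
```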

Steps to follow:

By following these steps you will be able to set up Hadoop in a Docker container.

Step 1: Clone the "Hadoop-on-Docker" repository from GitHub using the following command:
git clone https://github.com/gopalkumr/Hadoop-on-Docker.git

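Step 2: Change into the cloned directory:
cd Hadoop-on-Docker

Step 3: Start the containers in the background:
docker-compose up -d

Step 4: List the running containers to check that they started:
docker container ls

Step 5: Open a shell inside the namenode container:
docker exec -it namenode /bin/bash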
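
Step 5 only works once the namenode container is actually up. The check from Step 4 can be scripted roughly as below. This is a sketch: the helper name container_running is hypothetical, and it assumes the container is named namenode, as in the command above.

```shell
# Hypothetical helper: succeed only if a container with exactly this name is running.
# docker's name filter matches substrings, so grep -x enforces an exact match.
container_running() {
  docker container ls --filter "name=$1" --format '{{.Names}}' | grep -qx "$1"
}
```

For example, `container_running namenode && docker exec -it namenode /bin/bash` opens the shell only when the container is listed.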

Running Hadoop Code:

Step 1: Copy the code folder to the Docker container by running this command in a terminal opened in the folder where you cloned the repo:

docker cp code namenode:/

Step 2: Inside the container, go into the Hadoop_Code directory and then into its input directory, which contains the data.txt file you need to copy.

Step 3: Create the required directories in the Hadoop file system (HDFS) with the following commands:
-> hdfs dfs -mkdir /user
-> hdfs dfs -mkdir /user/root
-> hdfs dfs -mkdir /user/root/input

Step 4: Copy data.txt to the input directory (/user/root/input) created in the Hadoop file system with the following command:
-> hdfs dfs -put data.txt /user/root/input
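
Steps 3 and 4 can also be combined. The sketch below uses `hdfs dfs -mkdir -p`, which creates any missing parent directories in one call and does not fail if the directory already exists. The function name stage_input is hypothetical; the default path is the one used in the steps above.

```shell
# Hypothetical helper: create the HDFS input directory and upload one local file.
# Usage: stage_input <local-file> [hdfs-dir]   (hdfs-dir defaults to /user/root/input)
stage_input() {
  local_file=$1
  hdfs_dir=${2:-/user/root/input}
  # Fail early if the local file is missing, before touching HDFS.
  [ -f "$local_file" ] || { echo "stage_input: $local_file not found" >&2; return 1; }
  hdfs dfs -mkdir -p "$hdfs_dir" && hdfs dfs -put "$local_file" "$hdfs_dir"
}
```

Run inside the namenode container, `stage_input data.txt` would replace the four commands from Steps 3 and 4.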

Step 5: Return to the directory where the wordCount.jar file is located:
-> cd ../

Step 6: Execute the jar file with the following command:
-> hadoop jar wordCount.jar org.apache.hadoop.examples.WordCount input output
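
The relative paths input and output in this command are resolved by HDFS against the current user's home directory. When the job runs as root in the container that is /user/root, which is why they line up with the directories from Step 3. A MapReduce job also normally refuses to start if its output directory already exists, so a second run needs the old output removed first. A minimal sketch of that, with a hypothetical function name run_wordcount:

```shell
# Hypothetical helper: clear any previous output, then run the job from Step 6.
# -r removes the directory recursively; -f suppresses the error if it does not exist.
run_wordcount() {
  hdfs dfs -rm -r -f output &&
    hadoop jar wordCount.jar org.apache.hadoop.examples.WordCount input output
}
```

Note that this deletes /user/root/output each time, so copy out any results you want to keep before re-running.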

Step 7: Display the output using this command:
-> hdfs dfs -cat /user/root/output/*
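
To sanity-check the result, the same counts can be approximated locally with standard Unix tools. This sketch assumes wordCount.jar behaves like the stock Hadoop WordCount example (tokens split on whitespace, one "word<TAB>count" line per distinct word, sorted by word); the function name local_wordcount is hypothetical.

```shell
# Hypothetical helper: rough local equivalent of the WordCount job.
# Splits the file on whitespace, counts each distinct token,
# and prints "word<TAB>count" lines sorted in byte order.
local_wordcount() {
  tr -s '[:space:]' '\n' < "$1" |
    grep -v '^$' |
    LC_ALL=C sort |
    uniq -c |
    awk '{ print $2 "\t" $1 }'
}
```

Running `local_wordcount data.txt` in the input directory should give lines comparable to the output of the command in Step 7, if the assumption about the jar holds.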
