asaduddin11's Introduction

WordCount-using-Hadoop-MapReduce

WordCount program code using RStudio.

The WordCount.java file, when executed, runs a MapReduce word count over large amounts of text. You can install and configure Hadoop on your own machine, as long as it runs Linux or OS X. Hadoop installation instructions are available at: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
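To make the map, shuffle and reduce steps of a word count concrete, here is a minimal plain-Java sketch with no Hadoop dependency. It is not the repository's WordCount.java: the class name WordCountSketch, its method names, and the tokenization rule (lowercasing and splitting on anything other than letters, digits and apostrophes) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the word-count MapReduce pattern (no Hadoop involved).
// All names here are hypothetical; this is not the repository's WordCount.java.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every token in one line of input.
    // Assumption: tokens are lowercased and split on non-alphanumeric characters.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("[^a-z0-9']+")) {
            if (!token.isEmpty()) {
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key, with keys in sorted order.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum all the values collected for a single word.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Runs the three phases in sequence over the given lines.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            counts.put(entry.getKey(), reduce(entry.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the game is afoot", "The game, Watson");
        // Print one "word<TAB>count" line per distinct word.
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

In a real Hadoop job the framework performs the shuffle itself and distributes the map and reduce calls across the cluster; the sketch only mirrors the data flow on a single machine.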

You can use either an IDE (Eclipse, NetBeans) or your favorite code editor to create the Hadoop classes. However, the project must be built and packaged using the Ant script. The following YouTube video provides a step-by-step guide to configuring Hadoop for Eclipse and Ant in the ITL environment: https://qmplus.qmul.ac.uk/mod/url/view.php?id=710623

Execution:

  1. Create a root folder (for example, Bigdata).
  2. Inside the Bigdata folder, create a src/ folder and store all the Java files in it.
  3. Create an input folder and store the input file (Sherlock.txt) in it.
  4. Include the build.xml file in the root folder. This file is customized: you will need to update the hadoop.base.path, hadoop.version and hadoop.core.file properties defined at the beginning of the file.

Open a terminal in the root directory and run:

    ant clean dist

Upon successful execution, this creates a dist folder in the root folder containing the WordCount.jar file. For a complete reference on the Ant build system, see http://ant.apache.org

Then, still in the root directory, run the job:

    hadoop-local jar dist/WordCount.jar WordCount input out

If successful, this creates an out folder in the root folder containing two files: part-r-00000 and _SUCCESS.

The output of the MapReduce program is stored in part-r-00000.
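To inspect the result, the output file can be read back and sorted by count. The sketch below assumes the job uses Hadoop's default text output, which writes one "word<TAB>count" pair per line; the class name TopWords and the default path out/part-r-00000 are illustrative choices, not part of the repository.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: lists the most frequent words in a word-count output file.
// Assumes each line has the form "word<TAB>count" (Hadoop's default text output).
class TopWords {

    // Parse the lines and return up to n entries, highest count first.
    static List<Map.Entry<String, Long>> top(List<String> lines, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue; // skip lines that do not contain a tab separator
            }
            String word = line.substring(0, tab);
            long count = Long.parseLong(line.substring(tab + 1).trim());
            entries.add(Map.entry(word, count));
        }
        // Stable sort by count, descending; ties keep their order from the file.
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return entries.subList(0, Math.min(n, entries.size()));
    }

    public static void main(String[] args) throws IOException {
        // Path is an assumption: "out" is the output folder named in the run command.
        Path output = Paths.get(args.length > 0 ? args[0] : "out/part-r-00000");
        for (Map.Entry<String, Long> entry : top(Files.readAllLines(output), 10)) {
            System.out.println(entry.getValue() + "\t" + entry.getKey());
        }
    }
}
```

Run from the root folder after the job completes, this prints the ten most frequent words with their counts.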
