

Twitter-data-partioning-using-MAP-REDUCE-AND-PARTITIONER

Goal :-

To display the count of every hashtag, using only the tweets associated with a given keyword.

Methodology:-

Mapper :- This is the initial phase of execution. Every data node reads its share of the data and runs the computation logic supplied by the main program.
The data nodes process the data, formatting or splitting it into key-value pairs. What serves as the key and what serves as the value is decided by the main program, or can be chosen by the user.
Reducer :- This is the last step, in which the key-value pairs produced by the mappers are combined. For each key, the reducer iterates over all of that key's values, adds them up, and returns the result.
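
The two phases described above can be sketched in plain Python. This is a minimal single-process simulation, not Hadoop code: the function names `mapper`, `shuffle`, `reducer` and `run_job`, the sample tweets, and the choice to lowercase hashtags are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def mapper(tweet_text):
    """Emit a (hashtag, 1) pair for every hashtag found in one tweet."""
    for tag in re.findall(r"#\w+", tweet_text):
        # Lowercasing is an assumption so that #Hadoop and #hadoop match.
        yield tag.lower(), 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reducer(key, values):
    """Add up every value collected for one key."""
    return key, sum(values)

def run_job(tweets):
    """Run map, shuffle and reduce over a list of tweet texts."""
    pairs = [pair for tweet in tweets for pair in mapper(tweet)]
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())

# Hypothetical sample data, not taken from the project.
sample_tweets = [
    "Learning #Hadoop and #MapReduce today",
    "#hadoop partitioners are neat",
]
hashtag_counts = run_job(sample_tweets)
print(hashtag_counts)
```

In a real cluster the shuffle step is carried out by the framework itself; it is written out here only to make the grouping by key visible.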

How this is implemented here :-

A well-known problem that can be solved with MapReduce is the word count problem: given a file or folder, the output is every word together with the number of times it occurs across the whole input. The aim of this part is to calculate the total number of rows in a dataset, so the usual word count program can be adapted to do that. Instead of words, the mapper takes the whole JSON object (one tweet row) as the key, and the value associated with every such key is 1. These objects are tweets, and Twitter data does not contain repeated tweets, so the mapper can fix the value 1 for every tweet.
In the reducer, all these 1s are added and the total count is returned. Because every row contributes exactly one 1, this total is the number of tweets in the folder.
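
As a rough illustration of the row-counting idea, the sketch below uses each tweet row as the key with a value of 1 and then adds everything up. It is a plain Python simulation under stated assumptions: the JSON field names `id` and `text`, the sample rows, and the helper `count_tweets` are hypothetical, and adding the per-key sums into one total is one possible reading of the description above.

```python
import json
from collections import defaultdict

def mapper(line):
    """Emit the whole tweet row as the key with a value of 1."""
    row = line.strip()
    if row:  # skip blank lines
        yield row, 1

def reducer(key, values):
    """Add up the 1s collected for one tweet row."""
    return key, sum(values)

def count_tweets(lines):
    """Group the mapper output by row, reduce each group, and total the results."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    per_row = [reducer(key, values) for key, values in grouped.items()]
    return sum(count for _, count in per_row)

# Hypothetical sample rows; real tweet objects carry many more fields.
sample_lines = [
    json.dumps({"id": 1, "text": "first #tweet"}),
    json.dumps({"id": 2, "text": "second #tweet"}),
    json.dumps({"id": 3, "text": "third one"}),
]
total_tweets = count_tweets(sample_lines)
print(total_tweets)
```

Because every row is assumed to be unique, each key ends up with a count of 1, and the overall total equals the number of rows.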

Steps of execution :-

  1. First, set up Hadoop.
  2. Next, in the partitioner code, change the keywords in quotes to your own.
  3. Run the commands appropriate to your compiling environment.

Conclusion :-

As many partitioned files are created as there are keywords set in the code, and each file contains the hashtag counts for its keyword.
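
The relationship between keywords and partitions can be sketched as follows. This is a plain Python illustration rather than the project's partitioner: the keyword list, the function names `get_partition` and `count_hashtags_by_partition`, the first-match rule, and the decision to drop tweets that match no keyword are all assumptions.

```python
import re
from collections import Counter

# Hypothetical keywords; in the project these are edited in the partitioner code.
KEYWORDS = ["hadoop", "python"]

def get_partition(tweet_text, keywords=KEYWORDS):
    """Return the index of the first keyword found in the tweet, or None."""
    lowered = tweet_text.lower()
    for index, keyword in enumerate(keywords):
        if keyword in lowered:
            return index
    return None

def count_hashtags_by_partition(tweets, keywords=KEYWORDS):
    """Keep one hashtag counter per keyword, mirroring one output file each."""
    partitions = [Counter() for _ in keywords]
    for tweet in tweets:
        index = get_partition(tweet, keywords)
        if index is None:
            continue  # assumption: tweets without any keyword are ignored
        tags = (tag.lower() for tag in re.findall(r"#\w+", tweet))
        partitions[index].update(tags)
    return partitions

# Hypothetical sample data.
sample_tweets = [
    "Hadoop makes #bigdata easier #hadoop",
    "Python for #bigdata",
    "Nothing relevant #random",
]
partition_counts = count_hashtags_by_partition(sample_tweets)
for keyword, counts in zip(KEYWORDS, partition_counts):
    print(keyword, dict(counts))
```

With two keywords the sketch produces two counters, which corresponds to two partitioned output files in the description above.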

Contributors :-

srinathsai
