MapReduceImplementation

An implementation of the MapReduce paper in Go (Golang), with handling for race conditions and worker failures.

This repo implements the word count problem using the MapReduce approach.

  • At startup, the master reads the text files provided at run time.
  • Once the master is running, whenever a worker requests a map or reduce task, the master checks its available (not yet started) map and reduce tasks.
  • The master does not hand out reduce tasks until all map tasks are complete.
  • Hence, if a worker requests a reduce task before the map tasks are done, it still receives a map task; the task type is tracked through a custom Task struct.
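
The hand-out rule above can be sketched roughly as follows. This is a minimal illustration, not the repo's code: the `Task` fields, the `TaskStatus` values, the `Master` layout and the `NextTask` name are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// TaskStatus and Task are hypothetical; the repo's Task struct may differ.
type TaskStatus int

const (
	Idle TaskStatus = iota
	InProgress
	Completed
)

type Task struct {
	Kind   string // "map" or "reduce"
	File   string
	Status TaskStatus
}

// Master is a hypothetical holder for the two task lists.
type Master struct {
	mu          sync.Mutex
	mapTasks    []*Task
	reduceTasks []*Task
}

// allDone reports whether every task in the slice is Completed.
func allDone(tasks []*Task) bool {
	for _, t := range tasks {
		if t.Status != Completed {
			return false
		}
	}
	return true
}

// firstIdle marks the first idle task as in progress and returns it,
// or returns nil when no idle task is left.
func firstIdle(tasks []*Task) *Task {
	for _, t := range tasks {
		if t.Status == Idle {
			t.Status = InProgress
			return t
		}
	}
	return nil
}

// NextTask ignores which kind the worker asked for: reduce tasks are
// withheld until every map task is Completed.
func (m *Master) NextTask() *Task {
	m.mu.Lock()
	defer m.mu.Unlock()
	if !allDone(m.mapTasks) {
		return firstIdle(m.mapTasks) // nil if all maps are still in progress
	}
	return firstIdle(m.reduceTasks)
}

func main() {
	m := &Master{
		mapTasks:    []*Task{{Kind: "map", File: "pg-a.txt"}},
		reduceTasks: []*Task{{Kind: "reduce", File: "mr-0-0"}},
	}
	t := m.NextTask()
	fmt.Println(t.Kind, t.File)      // map pg-a.txt
	fmt.Println(m.NextTask() == nil) // true: map in progress, reduce withheld
	t.Status = Completed
	fmt.Println(m.NextTask().Kind) // reduce
}
```

Returning nil while map tasks are still in progress is one possible choice; a worker would then need to wait and ask again.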

For map tasks:

  • Each worker is given a unique ID at init. The output of each map task is split into 'nReduce' pieces, one for each reduce task.
  • The worker applies the map function to the contents of the file it receives from the master and writes intermediate output of the form (word, 1) to intermediate files named mr-workerID-reduceID.
  • Once a task is handed out, the master sleeps for 10 seconds and then checks whether the status of the corresponding Task struct has changed to complete.
  • If not, the task is marked as incomplete again and is later given to the next worker that asks.
  • If the master receives an ACK within 10 seconds, it marks the task as complete and creates Task structs for all of its intermediate files.
  • These newly created Tasks are added to the master's reduce queue.
  • Once all map tasks are done, the reduce stage starts.

For reduce tasks:

Since each input file is processed by a single worker, the reduce tasks are grouped by workerID.
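
One way to do this grouping is to parse the workerID back out of the mr-workerID-reduceID file names. The sketch below assumes both IDs are integers; `intermediateName` and `groupByWorker` are hypothetical helpers, not functions from this repo.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// intermediateName builds a name of the form mr-workerID-reduceID.
func intermediateName(workerID, reduceID int) string {
	return fmt.Sprintf("mr-%d-%d", workerID, reduceID)
}

// groupByWorker buckets intermediate file names by their workerID.
// Names that do not match mr-<int>-<something> are skipped.
func groupByWorker(names []string) map[int][]string {
	groups := make(map[int][]string)
	for _, name := range names {
		parts := strings.Split(name, "-")
		if len(parts) != 3 || parts[0] != "mr" {
			continue
		}
		workerID, err := strconv.Atoi(parts[1])
		if err != nil {
			continue
		}
		groups[workerID] = append(groups[workerID], name)
	}
	return groups
}

func main() {
	names := []string{
		intermediateName(1, 0), intermediateName(1, 1),
		intermediateName(2, 0), "notes.txt",
	}
	groups := groupByWorker(names)
	fmt.Println(len(groups[1]), len(groups[2])) // 2 1
}
```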

Final done() stage

Once the reduce queue in the master is empty, the master exits.
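
A minimal sketch of such an exit check is shown below. The `Master` fields are assumptions; the extra `mapsRemaining` check is also an assumption, added because the reduce queue is empty before any map task has finished as well.

```go
package main

import (
	"fmt"
	"sync"
)

// Master is a hypothetical stand-in; the field names are assumptions.
type Master struct {
	mu            sync.Mutex
	mapsRemaining int
	reduceQueue   []string
}

// Done reports whether the master may exit: no map task is outstanding
// and the reduce queue has been drained.
func (m *Master) Done() bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.mapsRemaining == 0 && len(m.reduceQueue) == 0
}

func main() {
	m := &Master{reduceQueue: []string{"mr-1-0"}}
	fmt.Println(m.Done()) // false: one reduce task still queued
	m.reduceQueue = m.reduceQueue[1:]
	fmt.Println(m.Done()) // true
}
```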

To do

Check what happens if a worker's ACK is received more than 10 seconds later.

To run:

Open at least 2 terminal tabs, one for the coordinator (coordinator.go) and the rest for the different workers (worker.go).
go build -race -buildmode=plugin ../mrapps/wc.go
go run -race mrcoordinator.go pg-*.txt (To start master)
go run -race mrworker.go wc.so (To start single worker)

To test:

bash test-mr.sh
