Thursday, 11 April 2013

INTRODUCTION TO MAPREDUCE | Hadoop Tutorial pdf

INTRODUCTION TO MAPREDUCE
MapReduce is a programming model and an associated implementation for processing and generating largedata sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model.
This abstraction is inspired by the map and reduce primitives present in Lisp and many other functional languages. We realized that most of our computations involved applying a map operation to each logical .record. in our input in order to compute a set of intermediate key/value pairs, and then applying a reduce operation to all the values that shared the same key, in order to combine the derived data appropriately. Our use of a functional model with user specilized map and reduce operations allows us to parallelize large computations easily and to use re-execution as the primary mechanism for fault tolerance.

No comments: