MapReduce

Terms from Artificial Intelligence: humans at the heart of algorithms

MapReduce is a distributed computing framework for dealing with big data. It is inspired by the map and reduce functions found in Lisp and other functional programming languages: the map stage initially processes each item of data separately, and then the reduce stage collects and aggregates related intermediate results. However, MapReduce has two crucial additional features: (i) the use of a hash as a means to distribute intermediate results over different computers, so reducing the likelihood of Byzantine conditions; and (ii) means to monitor for failure of individual computers and recover from this.

each data item → map → hash + processed data
all data for a hash → reduce → one or more aggregated calculations for the hashed data
(possibly including a fresh hash for further reduce steps)
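The two stages above can be sketched as a single-process word-count example in Python. This is only an illustration of the idea, not any real MapReduce API: the function names, the hash-based split into two partitions, and the choice of word counting are all assumptions made for the sketch. In a real cluster each partition would live on a different computer.

```python
from collections import defaultdict

def map_phase(items, map_fn, n_workers):
    # Map stage: process each data item separately, then use a hash of
    # the intermediate key to assign results to one of n_workers
    # reduce partitions (on a cluster, one partition per computer).
    partitions = [defaultdict(list) for _ in range(n_workers)]
    for item in items:
        for key, value in map_fn(item):
            partitions[hash(key) % n_workers][key].append(value)
    return partitions

def reduce_phase(partitions, reduce_fn):
    # Reduce stage: collect and aggregate the related intermediate
    # results; each partition can be reduced independently.
    result = {}
    for partition in partitions:
        for key, values in partition.items():
            result[key] = reduce_fn(key, values)
    return result

# Word count: map emits (word, 1) pairs; reduce sums counts per word.
def map_fn(line):
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    return sum(counts)

lines = ["the map stage", "the reduce stage"]
counts = reduce_phase(map_phase(lines, map_fn, n_workers=2), reduce_fn)
# counts == {"the": 2, "map": 1, "stage": 2, "reduce": 1}
```

Note that the final result is the same however the hash scatters keys across partitions, since each key's values always land in a single partition.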
MapReduce was initially developed by engineers at Google, but has since become part of the open-source Apache Hadoop project.

Used on pages 161, 163, 167, 169, 170, 551

MapReduce distributed computation pipeline