This is a great illustration of the MapReduce concept for anyone trying to understand the algorithm intuitively. I saw it at a Hadoop talk from Salesforce.com.
Basically, it's a laundry operation that sorts socks first, then washes them with "like colors", of course. :) The sorting tables are the Map step processors, and the washers carry out the Reduce. One important concept is that the Map step usually uses a generic processor, which doesn't mind working on any subset of the data; the Reduce step, on the other hand, is usually data-specific, which in this example means the red washer only runs red socks. The whole operation is horizontally scalable in a near-linear fashion, i.e., just add processing power (people and equipment, a table or a washer in this case) to handle a larger volume.
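To make the analogy concrete, here is a minimal single-process sketch of the sock laundry in Python. It is not from the talk and it is not how Hadoop is implemented; the names `map_sock`, `shuffle`, `reduce_wash`, `run_laundry` and the `num_tables` parameter are all made up for illustration. The point is only that any table can sort any pile, while each washer sees exactly one color.

```python
from collections import defaultdict


def map_sock(sock):
    """Map step (hypothetical): a sorting table tags a sock with its color.

    Any table can handle any sock, so this function needs no knowledge
    of which pile the sock came from.
    """
    return (sock["color"], sock)


def shuffle(pairs):
    """Group the (color, sock) pairs so each color ends up in one basket."""
    baskets = defaultdict(list)
    for color, sock in pairs:
        baskets[color].append(sock)
    return baskets


def reduce_wash(color, socks):
    """Reduce step (hypothetical): one washer handles one color only."""
    return {"color": color, "washed": len(socks)}


def run_laundry(socks, num_tables=2):
    """Split socks across tables, map, shuffle by color, then reduce."""
    # Each table gets an arbitrary slice of the input; which slice does not matter.
    piles = [socks[i::num_tables] for i in range(num_tables)]
    pairs = [map_sock(sock) for pile in piles for sock in pile]
    baskets = shuffle(pairs)
    return {color: reduce_wash(color, group) for color, group in baskets.items()}


if __name__ == "__main__":
    socks = [
        {"id": 1, "color": "red"},
        {"id": 2, "color": "blue"},
        {"id": 3, "color": "red"},
        {"id": 4, "color": "white"},
    ]
    print(run_laundry(socks, num_tables=3))
```

Changing `num_tables` changes how the work is split but not the result, which is the property that lets a real cluster add more mappers to handle more socks.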
Here's the original talk, with the MapReduce part run by Jed Crosby.