I am running some Hadoop benchmarks using teragen/terasort. One of the recommendations I was given was to disable speculative execution. I noticed something rather strange when I forced it off in the config.
Runtime with speculative execution: 18.5 minutes
Runtime without speculative execution: 1 hour
With speculative execution disabled, it seems that 2-3 map tasks take much longer than the rest.
The question now is: why? Each map task is responsible for generating the same percentage of the data, so why would speculative execution make the job run quicker? Does this point to hardware differences (though the slow tasks are on different machines, and I have not noticed a pattern yet), configuration problems elsewhere, or just random bad luck?
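
One rough way to look for a pattern is to flag the map tasks that run far longer than the median and count them per host. The Python sketch below assumes the per-task durations and hostnames have been copied by hand from the job's task list. The task IDs, hostnames, timings, and the 2x-median cutoff are all made up for illustration and are not from the actual runs.

```python
from collections import Counter
from statistics import median


def find_stragglers(tasks, factor=2.0):
    """Return the tasks whose duration exceeds factor * median duration."""
    cutoff = factor * median(t["seconds"] for t in tasks)
    return [t for t in tasks if t["seconds"] > cutoff]


def stragglers_per_host(tasks, factor=2.0):
    """Count how many straggler tasks ran on each host."""
    return Counter(t["host"] for t in find_stragglers(tasks, factor))


# Hypothetical sample data, NOT measurements from the benchmark above.
tasks = [
    {"task": "m_000000", "host": "node1", "seconds": 95},
    {"task": "m_000001", "host": "node2", "seconds": 102},
    {"task": "m_000002", "host": "node3", "seconds": 98},
    {"task": "m_000003", "host": "node1", "seconds": 100},
    {"task": "m_000004", "host": "node2", "seconds": 2900},
    {"task": "m_000005", "host": "node3", "seconds": 3100},
]

if __name__ == "__main__":
    for host, count in stragglers_per_host(tasks).most_common():
        print(host, count)
```

If the same host keeps showing up across several runs, that would hint at a hardware or per-node configuration problem. If the stragglers move around from run to run, something shared (network, disk contention) or plain bad luck seems more likely.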