Hadoop jobs require a high-performance storage solution. Although most object stores are now compatible with the S3 API, few can sustain the performance Hadoop requires when measured against that of HDFS itself.
This 16-page report demonstrates the high performance of OpenIO Object Storage in a Big Data environment.
Our goal in conducting this benchmark was to compare the performance of HDFS with that of OpenIO. We wanted to validate the integration of OpenIO and Hadoop for MapReduce jobs, using some reference Hadoop benchmarks and a modified version of DFSIO to test I/O performance.
We used the following benchmark tools: