While installing a new compute cluster at work, I evaluated several compilers and was surprised at how well Intel's compiler (icc) performs. Some of the results I obtained from the Scimark2 benchmark, shown below, make this apparent.
The graph shows the MFLOPS reported for each of the Scimark2 computational kernels on three different machines:
- A Dell PowerEdge 6650 with 4 Intel Xeon MPs (32-bit, hyperthreaded, 2.0 GHz);
- A Sun X200 M2 with 2 AMD Opteron 2200s (64-bit, dual core, 2.8 GHz); and
- A Dell PowerEdge 2950 with 2 Intel Xeon 5130s (64-bit, dual core, 2.0 GHz).
The 32-bit version of Scimark2 was compiled using version 10.0 of the Intel C++ compiler, while the 64-bit version was compiled using gcc 3.4.6 (the default compiler in Rocks 4.3/CentOS 4.5).
Not surprisingly, the 64-bit machines perform better on the whole, as evidenced by the scores for the "Composite", "FFT" (fast Fourier transform), "SOR" (Jacobi successive over-relaxation), and "LU" (dense matrix LU factorization) computational kernels. What is surprising, however, is that on some of the other kernels the five-year-old PowerEdge 6650 using icc actually managed to outperform the shiny new 64-bit machines using gcc. I haven't looked in detail at why this happens. However, based on the descriptions of the benchmark kernels, I suspect much of the performance gain comes from the Intel compiler being much better at managing certain types of memory operations.
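To give a sense of what these kernels look like and how an MFLOPS figure is derived, here is a minimal sketch of an SOR-style sweep in C. It is not the Scimark2 source: the function names `sor_sweep` and `mflops`, the row-major grid layout, and the flop count of 6 per interior point are my own assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: one SOR sweep over the interior of an n x n grid
   stored row-major in g, with relaxation factor omega. Returns the number
   of floating-point operations performed (assumed: 6 per interior point). */
double sor_sweep(double *g, size_t n, double omega)
{
    assert(g != NULL && n >= 3);
    const double quarter_omega = omega * 0.25;
    const double one_minus_omega = 1.0 - omega;

    for (size_t i = 1; i < n - 1; i++) {
        double *row = g + i * n;        /* row being updated in place */
        const double *up = row - n;     /* row above */
        const double *down = row + n;   /* row below */
        for (size_t j = 1; j < n - 1; j++) {
            /* five loads and one store for six flops */
            row[j] = quarter_omega * (up[j] + down[j] + row[j - 1] + row[j + 1])
                   + one_minus_omega * row[j];
        }
    }
    /* 3 adds for the neighbour sum, 2 multiplies, 1 final add */
    return 6.0 * (double)(n - 2) * (double)(n - 2);
}

/* Hypothetical helper: convert a flop count and elapsed seconds to MFLOPS. */
double mflops(double flops, double seconds)
{
    return flops / (seconds * 1.0e6);
}
```

A benchmark built this way would call `sor_sweep` repeatedly, time the loop, and pass the accumulated flop count and elapsed time to `mflops`. Each point update touches several memory locations for only a handful of arithmetic operations, so how a compiler schedules loads and stores can plausibly affect the score for a loop like this.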