Comments on: NVIDIA Grace-2xHopper Supercomputer Building Block at GTC 2022

By: Will

Will — Mon, 04 Apr 2022 22:10:25 +0000

I’ve used ICC myself and yes, it does absolute magic on SPEC (especially once you realise what transformations it actually does). But it is disappointing for anything else: I got about 5% slower code on my application. Phoronix recently tested AOCC: https://www.phoronix.com/scan.php?page=article&item=aocc32-clang-gcc&num=1

The official 2P 7763 SPECINT score is 913 using AOCC, AnandTech got 537 using GCC. So AOCC is clearly 70% faster than GCC, right? So how is it possible that GCC actually won most of the tests and for the tests with the largest spread AOCC was the slowest compiler? AOCC is certainly lacking some optimizations for code that doesn’t look like SPEC sources…
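For reference, the 70% figure follows directly from the two scores quoted above. A minimal Python sketch of the arithmetic (the helper name `percent_gain` is hypothetical, introduced only for illustration):

```python
def percent_gain(score: float, baseline: float) -> float:
    """Return how much higher `score` is than `baseline`, in percent."""
    return (score / baseline - 1.0) * 100.0

# Scores as quoted in the comment: the official 2P EPYC 7763 result (AOCC)
# and AnandTech's result for the same system using GCC.
aocc_official = 913
gcc_anandtech = 537

gain = percent_gain(aocc_official, gcc_anandtech)
print(f"AOCC over GCC: {gain:.0f}%")
```

The gap describes the SPEC suite only; it says nothing about how the two compilers compare on other code, which is the point the comment goes on to make.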

HPC people generally try different compilers and choose the fastest one for their code. However, there isn’t a single best compiler, and ICC/AOCC are certainly not the only compilers they use.

As for AnandTech, I’m not sure what you have against them. They are one of the few tech publications that use fair, meaningful and unbiased benchmarks.

By: Patrick Kennedy

Patrick Kennedy — Sat, 02 Apr 2022 18:42:29 +0000

In reply to Will.

Will – SPEC scores, by definition, are what is published on the spec.org website. Anything else is a different test.

And both ICC and AOCC are 100% used in production by commercial ISVs, and they yield a performance gain.

GCC is still valid as a lowest-common-denominator compiler, but relying on it alone does not value the optimization done on the software side. It is not that GCC is invalid, but if we are looking at maximum performance, it is hard to say GCC is the only compiler that counts.

And on being unbiased, again, you cited AT, but I am not currently trying to get a job at an Arm CPU vendor.

By: Will

Will — Sat, 02 Apr 2022 15:29:54 +0000

@Patrick: Saying Grace is already outclassed is not being neutral when you use inflated SPEC scores to get there. AnandTech does fair SPEC comparisons using the same compiler and options, and they show 540 for 2P Altra Max vs 537 for 2P EPYC 7763: https://images.anandtech.com/graphs/graph16979/122609.png

740 SPECINT is a 37% gain over Altra Max/EPYC, so that will definitely be competitive early 2023.
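As a quick check of that percentage against both baselines quoted in this comment, here is a small Python sketch (the names `baselines` and `gains` are illustrative only; the scores are the ones cited above, not independently verified):

```python
# The 740 figure quoted above, compared with the two AnandTech GCC results.
quoted_score = 740
baselines = {
    "2P Altra Max": 540,
    "2P EPYC 7763": 537,
}

# Relative gain in percent over each baseline.
gains = {
    name: (quoted_score / base - 1.0) * 100.0
    for name, base in baselines.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}%")
```

Both baselines land at roughly 37 to 38 percent, so the stated gain holds for either one.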

Note also that Grace has 2.5x the memory bandwidth of Milan(-X), and that is what really matters.

By: Patrick Kennedy

Patrick Kennedy — Sat, 02 Apr 2022 05:46:31 +0000

In reply to Will.

Will – The applications that AMD is targeting for Milan-X are ones that do not use GCC. AOCC/ICC do give meaningful performance gains, and that is why commercial vendors often use them. The two numbers measure different things, but the actual SPEC number is what is published on the website. I am not trying to get hired by an Arm CPU vendor, so we are a bit more neutral in our opinion on these things.

By: Will

Will — Fri, 01 Apr 2022 21:56:58 +0000

The “official” SPEC scores are fantasy numbers due to ICC and AOCC doing special tricks that only work on SPEC source code. They don’t indicate real-world performance in any meaningful way. Neither ICC nor AOCC gives significant speedups on general code over GCC or LLVM. They really only exist to give good SPEC scores, and that’s it.

Comparing a trick compiler with a standard compiler like GCC is misleading at best. The only fair comparison between different CPUs is to use the same compiler and same options. And that is what NVIDIA seems to have done here. You can find similar SPEC comparisons on AnandTech.

@Patrick: So what is the SPEC score for Milan-X using GCC with the options NVIDIA used? That’s the only useful comparison in this context, not the 840 fantasy number.

By: Matt

Matt — Fri, 25 Mar 2022 14:51:22 +0000

@xzbit That’s not true. When you see “estimated” next to a specint score, that means it’s not submitted and official. It’s quite common. In fact, you will see it very commonly in reviews for processors. When NVIDIA use gcc 10 instead of aocc for its Epyc specint scores, they are not comparing its projections to the submitted scores, but rather to scores representing a different context than the official scores are meant to represent. And there are two possible reasons for that: either 1) it’s inappropriate to compare it to the official results because it would not draw a proper comparison with their chip or 2) NVIDIA are being shady and making the competitor’s offerings look much, much worse than they really are.

Anyway, to suggest that it’s a “fantasy number” to use different settings from what was submitted to specint is off the wall, even if there were no such thing as optimized compilers to deal with. Isn’t it true that much of the code running on AMD Epyc CPUs was not compiled with AOCC? Then the AOCC scores mean very little to the performance of that specific code. Then how can it be fantasy to consider Epyc CPUs’ integer performance with compilers and settings that aren’t optimized for the specint test?

There are good reasons for maintaining an official specint score database using best results. And there are also good reasons for considering scores that use compilers and settings other than the ones that give those best results. It’s a little like using wrist straps or not using wrist straps for a deadlift. Even if there were some official repository of deadlifting results that use wrist straps, and wrist straps result in higher weights, it’s still useful to consider deadlifting results without wrist straps, as that tests a somewhat different context of strength. Such results are not “fantasy”.