Comments on: NVIDIA GTC 2022 Keynote Coverage Crazy New Data Center Gear
https://www.servethehome.com/nvidia-gtc-2022-keynote-coverage-crazy-new-data-center-gear/

By: Honcho | Wed, 20 Apr 2022 06:23:50 +0000
Actually, the A100 does 9.7 TF of non-tensor FP64, so the 30 TF figure (which was in the whitepaper at the Hopper presentation) is 3x the A100.
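A quick check of that ratio (a minimal sketch; the 9.7 TF and 30 TF values are the spec-sheet and whitepaper numbers quoted above):

```python
# Minimal check of the "3x from A100" claim using the numbers quoted above.
a100_fp64_vector_tf = 9.7    # A100 non-tensor (vector) FP64, per its spec sheet
h100_fp64_vector_tf = 30.0   # H100 vector FP64, per the Hopper whitepaper

ratio = h100_fp64_vector_tf / a100_fp64_vector_tf
print(f"H100 / A100 vector FP64: {ratio:.2f}x")  # ~3.09x, i.e. roughly 3x
```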

By: Patrick Kennedy | Wed, 23 Mar 2022 15:31:59 +0000
TS – updated the article with how NVIDIA came up with the 4.9 TB/s figure.

By: TS | Wed, 23 Mar 2022 15:14:59 +0000
Moore’s Law Is Dead just leaked the DP numbers:

*Vector* double precision = 30 TF! That is 50% higher than the A100, which is now the right number, corresponding exactly to the 50% higher HBM3 bandwidth.

*Matrix* double precision = 60 TF, basically the same kind of fake tensor teraflops you should ignore.

It’s sad that NVIDIA is adding up 3 TB/s + 0.9 TB/s + 0.9 TB/s + 0.128 TB/s to get to the 4.9 TB/s figure; it looks like the H100 will be a transformer-only solution for now.
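For anyone checking the math, here is how those components add up (values as quoted above; which links NVIDIA actually counts in the sum is my assumption from the arithmetic):

```python
# Reconstructing the 4.9 TB/s headline from the components quoted above.
hbm3   = 3.0    # TB/s, usable HBM3 memory bandwidth
nvlink = 0.9    # TB/s, NVLink (counted twice in the sum above)
pcie   = 0.128  # TB/s, PCIe Gen5

total = hbm3 + nvlink + nvlink + pcie
print(f"Headline total: {total:.3f} TB/s")  # 4.928 TB/s, rounded to 4.9
```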

By: TS | Wed, 23 Mar 2022 04:44:46 +0000
@Patrick:

Thought about it for about half a day, and realized the following:

The chip was probably designed for a 4.9 TB/s maximum (6 stacks at the 819 GB/s HBM3 specification), but NVIDIA probably couldn’t fit 6 stacks of 819 GB/s HBM3 into the 700 W SXM5 socket TDP limit, so NVIDIA took the easy way out and runs 6 stacks at 600 GB/s, with one stack either disabled or reserved for “RAID5”-style redundancy, for a total of 3 TB/s usable.
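The stack math behind that guess (the 6-stack/5-active configuration is speculation, not something NVIDIA has confirmed):

```python
# Speculative stack math: design maximum vs. what the shipping part does.
design_max = 6 * 0.819   # 6 stacks at the 819 GB/s HBM3 spec -> ~4.9 TB/s
shipping   = 5 * 0.600   # 5 active stacks at 600 GB/s        -> 3.0 TB/s usable

print(f"Design max: {design_max:.2f} TB/s, shipping: {shipping:.2f} TB/s")
```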

The real question is this: with only 50% more memory bandwidth than the A100, how did NVIDIA manage to fit in 60 TF of double precision (a 3x boost over the A100, with the rest of the specs also 3x across the board)? The 3x DP boost is questionable, because FP64 can’t be cheated the way the TF32 format reduces 32 bits to 19 bits.
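To put numbers on why 3x compute looks odd next to 50% more bandwidth, here is the compute-to-bandwidth ratio (a sketch using the A100’s published tensor FP64 and the claimed H100 figures):

```python
# FLOPS-per-bandwidth comparison; H100 values are the claims being questioned.
gpus = {
    "A100": {"fp64_tensor_tf": 19.5, "bw_tb_s": 2.0},  # published A100 SXM specs
    "H100": {"fp64_tensor_tf": 60.0, "bw_tb_s": 3.0},  # claimed keynote figures
}
for name, g in gpus.items():
    print(f"{name}: {g['fp64_tensor_tf'] / g['bw_tb_s']:.1f} TF per TB/s")
# A100: ~9.8, H100: 20.0 -- the ratio roughly doubles, which is why a 3x
# compute jump on only 50% more memory bandwidth raises eyebrows.
```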

Another thing: if the 700 W SXM5 can only do 3 TB/s, the 350 W PCIe card will probably do 2 TB/s, which is the same memory bandwidth as the SXM4 A100, and how much of a performance cut will that incur?

I think NVIDIA’s H100 chip is design-ready at 4.9 TB/s but HBM3 wasn’t, and we will probably have to wait for an HBM3 die shrink, for both capacity and TDP improvements, before we see a 4.9 TB/s H100 SXM5 land. I mean, 3 TB/s versus 4.9 TB/s is a generational leap.

By: Patrick Kennedy | Tue, 22 Mar 2022 19:10:31 +0000
In reply to TS.

Have to wait a bit to see what the actual parts come in at.

By: TS | Tue, 22 Mar 2022 18:37:43 +0000
Conflicting information during the GTC presentation:

Is the H100’s memory bandwidth 4.9 TB/s or 3 TB/s (3 x 8 = 24 TB/s for the 8-GPU system)? According to AnandTech, it is 3 TB/s, which is a bummer if true: only 50% higher than the A100.
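(Reading the keynote’s 24 TB/s as the 8-GPU aggregate rather than a single card is my assumption from the arithmetic:)

```python
# Assumed reading of the keynote's 24 TB/s: per-GPU bandwidth x 8 GPUs.
per_gpu_tb_s = 3.0
print(f"8-GPU aggregate: {per_gpu_tb_s * 8} TB/s")  # 24.0 TB/s
```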
