Comments on: NVIDIA A40 48GB GPU Mini-Review https://www.servethehome.com/nvidia-a40-48gb-gpu-mini-review/ Server and Workstation Reviews Sun, 20 Mar 2022 18:05:31 +0000

By: Matt https://www.servethehome.com/nvidia-a40-48gb-gpu-mini-review/#comment-481493 Sun, 20 Mar 2022 18:05:31 +0000 https://www.servethehome.com/?p=59795#comment-481493 @steffen It does use dedicated silicon as NVIDIA architects it now. If I remember correctly, in Kepler FP64 also used dedicated silicon. They cut the FP64 rate in firmware for differentiation purposes, as did AMD. In the Tesla (architecture, not product line) generation, and maybe in Fermi, NVIDIA had FP32 units built into their FP64 units and so didn’t use as much extra die area for the FP64. However, I believe that’s less energy efficient and that’s why they changed it. Or perhaps they changed it for some other reason, such as easier optimization or the ability to reach higher clock speeds. Regardless, AMD followed suit.

Also, the fact that AMD produced the Radeon VII (in extremely low quantities) has no bearing on the fundamental physics. AMD was looking for markets for chips. The card sucked for most use cases and, as noted, was produced in very low quantities. If anything, it can be used as evidence that use case differentiation in the GPU space is very real.

There’s so much “conspiracy theory” online when it comes to NVIDIA. It’s tiresome.

By: Steffen https://www.servethehome.com/nvidia-a40-48gb-gpu-mini-review/#comment-481473 Sun, 20 Mar 2022 07:35:26 +0000 https://www.servethehome.com/?p=59795#comment-481473 @Matt I don’t think OP was thinking that FP64 uses dedicated silicon, and I don’t think so either. In the earlier days of Quadro (think Kepler) the Titan used the same chip, but had FP64 cut by 2/3.

By: Matt https://www.servethehome.com/nvidia-a40-48gb-gpu-mini-review/#comment-481457 Sat, 19 Mar 2022 17:00:56 +0000 https://www.servethehome.com/?p=59795#comment-481457 Ironically, you seem to think you are complaining about too much market segmentation but in fact you are complaining about not enough. Additionally, more segmentation of the type you are looking for would result in more expensive parts, or simply the market not being served at all. Most GPU domains do not require FP64 operations. The A40 is based on the GA102 GPU, which is the top-of-the-line graphics/gaming oriented GPU from NVIDIA. If it included full FP64, it would lose performance in its core markets: gaming and professional graphics. If you want FP64 on an NVIDIA data center GPU less powerful than an A100, then you have to go for the A30 or for an older generation Gx100 GPU. If you want both ray tracing and full FP64 on the same GPU, then you must either go to another manufacturer or convince NVIDIA that the size of your market is large enough to either 1) warrant NVIDIA spending hundreds of millions of dollars to make a special GPU for the market or 2) warrant NVIDIA spending money to add ray tracing to all data center GPUs and reducing the CUDA performance somewhat in the process due to the loss of die area and the sacrifices that must be made to optimize for a greater set of constraints.

By: hoohoo https://www.servethehome.com/nvidia-a40-48gb-gpu-mini-review/#comment-481456 Sat, 19 Mar 2022 15:26:48 +0000 https://www.servethehome.com/?p=59795#comment-481456 What is your pair of bar charts supposed to mean?

Are you comparing the A40 cards on hand to past A40s you’ve tested? Are you comparing the same cards running in different servers (i.e., the charts are actually server comparisons)?

Anyway, the A40 (like the A10 and A6000) has negligible FP64 ability but superior FP32. The A100 is an FP64 monster with relatively modest FP32.

By: Eric Olson https://www.servethehome.com/nvidia-a40-48gb-gpu-mini-review/#comment-481440 Sat, 19 Mar 2022 04:29:56 +0000 https://www.servethehome.com/?p=59795#comment-481440 This GPU is rated 0.5 TFlops double precision with single precision an astonishing 64 times faster than double.

For reference a lowly Radeon VII yields 3.3 TFlops of double precision which is about 6 times faster than the A40.
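
The arithmetic behind those two figures can be checked with a minimal sketch. It uses only the rounded numbers quoted in this comment (0.5 TFlops, the 64x ratio, 3.3 TFlops); the variable and function names are hypothetical, and nothing here is a measurement or a vendor-published specification.

```python
# Rounded figures as quoted in the comment above (assumptions, not measurements).
A40_FP64_TFLOPS = 0.5          # quoted A40 double-precision rating
A40_FP32_PER_FP64 = 64         # quoted single-to-double throughput ratio
RADEON_VII_FP64_TFLOPS = 3.3   # quoted Radeon VII double-precision rating


def implied_fp32_tflops(fp64_tflops: float, fp32_per_fp64: float) -> float:
    """Single-precision rate implied by an FP64 rating and an FP32:FP64 ratio."""
    return fp64_tflops * fp32_per_fp64


def relative_throughput(card_a_tflops: float, card_b_tflops: float) -> float:
    """How many times faster card A is than card B at the same precision."""
    return card_a_tflops / card_b_tflops


a40_fp32 = implied_fp32_tflops(A40_FP64_TFLOPS, A40_FP32_PER_FP64)
fp64_advantage = relative_throughput(RADEON_VII_FP64_TFLOPS, A40_FP64_TFLOPS)

print(f"Implied A40 FP32 rate: {a40_fp32:.1f} TFlops")
print(f"Radeon VII vs A40 at FP64: {fp64_advantage:.1f}x")
```

With the rounded 0.5 TFlops input the ratio lands somewhat above 6, so "about 6 times" depends on how the A40 figure is rounded.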

In my opinion the level of market segmentation that Nvidia is practicing with their GPUs right now can’t be good for anyone but Nvidia.

Since the main advantage GPGPU computing has over special-purpose neural network hardware is that GP stands for general purpose, it would be useful to test a comprehensive selection of general application domains in the GPU reviews. Such tests are especially important for GPUs with intentionally lopsided capabilities that target very specific market segments.
