
BeatLeJuce t1_j4p938f wrote

Thanks for the explanation! Why call it TF32 when it appears to have 19 bits? (IIUC it's bfloat16 with 3 additional bits of mantissa?)

1

royalemate357 t1_j4qdfwj wrote

Tbh I don't think it's an especially good name, but I believe the answer to your question is that a TF32 value actually takes up 32 bits in memory. It's just that when the values are passed into the tensor cores for matmuls, they're temporarily downcast to this 19-bit precision format.

>Dot product computation, which forms the building block for both matrix multiplies and convolutions, rounds FP32 inputs to TF32, computes the products without loss of precision, then accumulates those products into an FP32 output (Figure 1).

(from https://developer.nvidia.com/blog/accelerating-ai-training-with-tf32-tensor-cores/)
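If it helps to see what that downcast does, here's a rough numpy sketch that keeps only the 10 mantissa bits TF32 retains (1 sign + 8 exponent + 10 mantissa = 19 bits). Note this just truncates the low mantissa bits rather than using the hardware's actual rounding mode, so it's only an approximation of the precision loss, not how the tensor cores really do it:

```python
import numpy as np

def truncate_to_tf32(x: np.ndarray) -> np.ndarray:
    """Zero out the 13 least-significant mantissa bits of float32 values,
    leaving the 10 mantissa bits that TF32 keeps.
    (Truncation, not the hardware's round-to-nearest -- illustration only.)"""
    bits = x.astype(np.float32).view(np.uint32)
    bits &= np.uint32(0xFFFFE000)  # keep sign + exponent + top 10 mantissa bits
    return bits.view(np.float32)

x = np.array([1.0 + 2**-11], dtype=np.float32)  # representable in FP32, below TF32 precision
print(truncate_to_tf32(x))  # -> [1.] : the 2**-11 term falls below TF32's mantissa
```

In practice you don't do this yourself; frameworks expose it as a switch (e.g. `torch.backends.cuda.matmul.allow_tf32 = True` in PyTorch), and the inputs/outputs you see are still ordinary FP32 tensors.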

3