Submitted by Dr_Singularity t3_ywdsks in singularity
94746382926 t1_iwk1qrv wrote
Reply to comment by Dr_Singularity in Cerebras Builds Its Own (1 Exaflop) AI Supercomputer - Andromeda - in just 3 days by Dr_Singularity
Linear speed-up in training time, not necessarily in model performance. Just wanted to mention that since it's an important distinction.
visarga t1_iwkbncq wrote
One Cerebras chip is roughly as fast as 100 top-end GPUs, but in terms of memory it only holds about 20B weights; they mention GPT-NeoX 20B. They would need to stack around 10 of these to train GPT-3.
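A quick back-of-envelope sketch of that estimate (not from the thread itself; the 20B-per-chip capacity and model sizes are taken from the comment above, everything else is an assumption for illustration):

```python
import math

def chips_needed(model_params_billion: float, capacity_per_chip_billion: float = 20.0) -> int:
    """Round up how many chips are needed if each holds ~20B weights."""
    return math.ceil(model_params_billion / capacity_per_chip_billion)

print(chips_needed(20))   # GPT-NeoX 20B -> 1 chip
print(chips_needed(175))  # GPT-3 175B   -> 9 chips, roughly the "stack 10" figure above
```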