Comments


Dr_Singularity OP t1_iwj0ct4 wrote

It Delivers Near Perfect Linear Scaling for Large Language Models

25

94746382926 t1_iwk1qrv wrote

Linear speed up in training time, not necessarily in performance. Just wanted to mention that as it's an important distinction.

10

visarga t1_iwkbncq wrote

One Cerebras chip is about as fast as 100 top GPUs, but memory-wise it only handles 20B weights; they mention GPT-NeoX 20B. They'd need to stack 10 of these to train GPT-3.

8
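The chip-count figure above is simple division; a minimal sketch, assuming the published 175B-parameter GPT-3 size and taking the 20B-weights-per-chip capacity at face value from the comment:

```python
import math

# Assumptions: 20B weights fit on one chip (per the comment above),
# GPT-3 has 175B parameters (published figure).
params_per_chip = 20e9
gpt3_params = 175e9

chips_needed = math.ceil(gpt3_params / params_per_chip)
print(chips_needed)  # → 9
```

Rounding up gives 9 chips; the comment's "10" presumably leaves some headroom beyond the raw weight count.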

Lorraine527 t1_iwxm2oj wrote

GPUs have much more memory per core, and that's needed for language models.

1

ihateshadylandlords t1_iwjh7dr wrote

So what are the implications of this? From what I could tell from the article, it looks like it trains LLMs faster.

22

Rakshear t1_iwmo55s wrote

If it can be scaled for mass production, I think we won't need AI to run on cloud-based infrastructure. Faster also equals cheaper.

3

Rakshear t1_iwjgzs8 wrote

Wtf? This is freaking awesome, we might actually see a 2030 date for the beginning.

21

AsuhoChinami t1_iwma8zs wrote

The beginning of what?

2

agorathird t1_iwmfs3j wrote

*points toward subreddit name*

15

AsuhoChinami t1_iws4nvm wrote

Omg lol XD XD Yeah, I wasn't sure if he meant the Singularity or AGI.

1