
t1_j7kzd3i wrote

> https://arxiv.org/pdf/1706.03762.pdf the paper that made all this possible.

That's reaching IMHO. The original transformer was only around 65 million parameters (about 213 million for the "big" variant). It's nowhere near the scale of ChatGPT.
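For context, the core mechanism that paper introduced is scaled dot-product attention. A minimal NumPy sketch (function name and shapes are illustrative, not from the paper's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n_queries, n_keys) similarity scores
    # numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted sum of value rows

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 queries, d_k = 8
K = rng.standard_normal((6, 8))  # 6 keys
V = rng.standard_normal((6, 8))  # 6 values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one output vector per query
```

The point of contention in the thread is scale, not mechanism: this same operation, stacked and widened, is what later models grew to billions of parameters.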

You may as well say MIT invented it, since Google's paper is based on methods created by them.

2

t1_j7ladne wrote

Please. Without the transformer we would never have been able to scale, not to mention all of this being built on BERT as well. Then a bunch of companies scaled it further, including Google.

0

t1_j7p3gn4 wrote

> Please without the transformer we would never be able to scale,

Without back propagation we wouldn't have transformers. 🤷‍♂️

2