justowen4 t1_it645n5 wrote

In case you missed it, LLMs surprised us by scaling beyond expectations. The underestimation happened because LLMs came out of the NLP world of simple word2vec-style word associations. In 2017 the groundbreaking "Attention Is All You Need" paper showed that the simple transformer architecture alone, given lots of GPU time, can outperform other model types. Why? Because it's no longer an NLP word-association network; it's a layered context calculator that uses words as ingredients (see the sketch below). They're barely worth calling LLMs unless you redefine language to be integral to human intelligence.

9
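
For readers who haven't seen the mechanism under discussion, here's a minimal NumPy sketch of the scaled dot-product attention at the heart of the transformer from "Attention Is All You Need" (toy dimensions, random data, purely illustrative, not any particular model's implementation):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each token's output is a weighted blend of all value vectors, with
    weights derived from query-key similarity: context computed on the fly,
    not fixed word2vec-style associations."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # mix values by attention

# Toy self-attention: 3 tokens, 4-dimensional embeddings, Q = K = V.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(x, x, x).shape)   # (3, 4)
```

A full transformer stacks many such attention layers (plus learned Q/K/V projections, multiple heads, and feed-forward blocks), which is what the comment means by a "layered context calculator."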

ftc1234 t1_it64mud wrote

I know what LLMs are. They're a surprising development, but the history of technology is littered with surprising discoveries and inventions, and very few of them are of the earth-shattering variety. I don't believe LLMs are of that variety, for the reasons I stated. CNNs were all the rage before LLMs, and respected giants in the field such as Yann LeCun have also said that LLMs are important but they aren't everything.

4