Submitted by Vegetable-Skill-9700 t3_121agx4 in deeplearning
BellyDancerUrgot t1_jds7iva wrote
Reply to comment by StrippedSilicon in Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
The reason I say it’s a recontextualization that lacks deeper understanding is that it doesn’t hallucinate sometimes, it hallucinates all the time; sometimes the hallucinations happen to align with reality, that’s all. Take this thread, for example:
- https://twitter.com/ylecun/status/1639685628722806786?s=48&t=kwpwSgfnJvGe6J-1CEe_5Q
- https://twitter.com/stanislavfort/status/1639731204307005443?s=48&t=kwpwSgfnJvGe6J-1CEe_5Q
- https://twitter.com/phillipharr1s/status/1640029380670881793?s=48&t=kwpwSgfnJvGe6J-1CEe_5Q
A system that fully understood the underlying structure of the question would not give you varying answers with the same prompt.
“Inconclusive” is its third-likeliest answer. Despite a strong bias toward the correct answer (keywords like “dubious”, for example), it still makes mistakes on a rather simple question. Sometimes it gets it right with the bias, sometimes even without it.
Language, imo, lacks a causal link to intelligence, since it’s a mere byproduct of intelligence. Which is why, imo, these models hallucinate all the time: sometimes the hallucinations line up with reality and sometimes they don’t, and the likelihood of the former is just increased by the huge training-set size.
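The “varying answers with the same prompt” point above largely comes down to stochastic decoding: at temperature > 0, the model samples from its next-token distribution instead of always taking the top choice, so a lower-ranked answer like “inconclusive” can still surface on some runs. A minimal sketch, with made-up probabilities purely for illustration (not measured from any actual model):

```python
import random

# Hypothetical answer distribution for a puzzle like the one in the linked
# thread; the three options and their probabilities are invented for
# illustration only.
answers = ["correct", "wrong", "inconclusive"]
probs = [0.6, 0.25, 0.15]  # "inconclusive" as the third-likeliest answer

def sample_answer(rng: random.Random) -> str:
    # Temperature > 0 decoding: draw from the distribution rather than argmax.
    return rng.choices(answers, weights=probs, k=1)[0]

rng = random.Random(0)
runs = [sample_answer(rng) for _ in range(10)]
print(runs)  # same "prompt", varying answers across repeated runs
```

Greedy (argmax) decoding would return the same answer every time; sampling explains varying outputs without settling whether the model “understands” the question.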
StrippedSilicon t1_jdt7h5o wrote
So... how exactly does it solve a complicated math problem it hasn't seen before, if it's only regurgitating information?
BellyDancerUrgot t1_jdtci38 wrote
Well, let me ask you: how does it fail simple problems if it can solve more complex ones? If you solve these problems analytically, then it stands to reason that you wouldn’t ever make an error on a question as simple as that.
StrippedSilicon t1_jdte8lj wrote
That's why I'm appealing to the "we don't actually understand what it's doing" case. Certainly the AGI-like intelligence explanation falls apart in a lot of cases, but the explanation that it's only spitting out the training data in a different order or context doesn't work either.