FoniksMunkee t1_je38yix wrote

Yes, I agree. The paper was fascinating, but a lot of people took away from it the idea that AGI is essentially here. When I read it, I saw a couple of issues that could be speed bumps in progress. The authors definitely underplayed what seems to be a difficult problem to solve within the current paradigm.

2

FoniksMunkee t1_je37buh wrote

I'm pretty sure they mentioned something like that in passing, didn't they? I know there's a section in there about how it fails at some math and language problems because it can't plan ahead and can't make leaps of logic. And they considered these substantial problems with GPT-4, with no obvious fix.

4

FoniksMunkee t1_je3705l wrote

Microsoft may have agreed. In the paper they released talking about "sparks of AGI", they identified a number of areas that LLMs fail at, mostly forward planning and leaps of logic or Eureka moments. They actually pointed at LeCun's paper and said it's a potential solution... but that suggests they can't solve it yet with the ChatGPT approach.

3

FoniksMunkee t1_je36ntn wrote

If it really is accelerating exponentially, then most of the people cheering for this will be out on their arse with no job the next day.

And those who don't lose their jobs will lose them shortly afterwards, when the next exponential leap comes.

It's amazing tech, but we don't control it; corporations do. And what happens when corporations smell profit?

2

FoniksMunkee t1_jdqtjhv wrote

This opinion is not shared by MS. In their paper discussing the performance of GPT-4, they referred to its inability to solve some simple maths problems. They commented:

"We believe that the issue constitutes a more profound limitation."

They say: "...it seems that the autoregressive nature of the model which forces it to solve problems in a sequential fashion sometimes poses a more profound difficulty that cannot be remedied simply by instructing the model to find a step by step solution" and "In short, the problem ... can be summarized as the model’s “lack of ability to plan ahead”."

They went on to say that more training data will help, but will likely not solve the problem, and made an offhand comment that a different architecture has been proposed that could solve it. But that's not an LLM.

So yes, if you solve the problem, the model will be better at reasoning in all cases. But the problem is that LLMs work in a way that makes that pretty difficult.
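To make that "sequential fashion" concrete, here's a toy sketch of the next-word-prediction loop the paper is describing. This is my own illustration, not code from the paper, and `toy_model` is just an invented lookup table standing in for a real autoregressive LLM:

```python
# Toy sketch of autoregressive (next-word-prediction) decoding.
# `model` is a hypothetical stand-in: given the tokens so far, it
# returns a probability distribution over the next token.

def greedy_decode(model, prompt, max_new_tokens=3):
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = model(tokens)                     # distribution over the next token
        tokens.append(max(probs, key=probs.get))  # commit the single most likely token
        # Nothing here ever re-opens an earlier token: generation is
        # strictly left-to-right, which is the "sequential fashion"
        # the paper is talking about.
    return tokens

# Tiny stand-in "model" (a lookup table), for illustration only.
def toy_model(tokens):
    table = {"2+2": {"=": 1.0}, "=": {"4": 0.9, "5": 0.1}}
    return table.get(tokens[-1], {"<eos>": 1.0})

print(greedy_decode(toy_model, ["2+2"]))  # ['2+2', '=', '4', '<eos>']
```

The point is the shape of the loop: each iteration commits exactly one token, and there's no step anywhere that goes back and revises.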

4

FoniksMunkee t1_jdqt5ci wrote

Regarding 2, MS says: "We believe that the ... issue constitutes a more profound limitation."

They say: "...it seems that the autoregressive nature of the model
which forces it to solve problems in a sequential fashion sometimes poses a more profound difficulty that cannot be remedied simply by instructing the model to find a step by step solution" and "In short, the problem ... can be summarized as the model’s “lack of ability to plan ahead”."

Notably, MS did not provide a solution for this, and pointed at another paper by LeCun that suggests a non-LLM model to solve the issue. Which is not super encouraging.

2

FoniksMunkee t1_jdqs9x9 wrote

It's a limitation of LLMs as they currently stand. They can't plan ahead, and they can't backtrack.

So a human doing a problem like this would start, see where they get to, and perhaps try something else. But LLMs can't. MS wrote a paper on the state of GPT-4, and they made this observation about why LLMs suck at math:

"Second, the limitation to try things and backtrack is inherent to the next-word-prediction paradigm that the model operates on. It only generates the next word, and it has no mechanism to revise or modify its previous

output, which makes it produce arguments “linearly”. "
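For contrast, here's what "try things and backtrack" looks like as code: a depth-first search over a toy puzzle. Everything here (the puzzle, the helper names) is invented for illustration; it just shows the revise-and-retry step the quote says the model lacks:

```python
# Rough sketch (invented for illustration) of "try, check, back up":
# depth-first search with backtracking over a toy puzzle.

def solve(state, moves, is_valid, is_goal):
    if is_goal(state):
        return state
    for m in moves(state):
        nxt = state + [m]                      # tentatively commit a step
        if not is_valid(nxt):
            continue                           # reject it, try the next option
        found = solve(nxt, moves, is_valid, is_goal)
        if found is not None:
            return found
        # Falling through here *is* the backtrack: we undo the tentative
        # step and try a different one -- the move a pure next-word
        # predictor has no mechanism for.
    return None

# Toy puzzle: pick three distinct digits that sum to 20.
is_goal = lambda s: len(s) == 3 and sum(s) == 20
is_valid = lambda s: len(set(s)) == len(s) and sum(s) <= 20 and len(s) <= 3
moves = lambda s: range(10)

print(solve([], moves, is_valid, is_goal))  # [3, 8, 9]
```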

They also argue that the model was probably not trained on as much mathematical data as code, and that more training will help. But they also said the issue above "...constitutes a more profound limitation."

6

FoniksMunkee t1_jdputbl wrote

Even MS are speculating that LLMs alone are not going to solve some of the problems they see with GPT-4's ability to reason. It has no ability to plan, or to solve problems that require a leap of logic; as they put it, it lacks the slow-thinking process that oversees the fast-thinking process. They acknowledge that other authors who have recognised the same issue with LLMs have suggested a different architecture may be required. But this seemed to be the least fleshed-out part of the paper.
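For what it's worth, here's my loose reading of that "slow oversees fast" idea as Python. This is my interpretation, not an architecture from the paper; `generate` and `critique` are hypothetical stand-ins for a fast linear generator and a slower pass that reviews the whole draft:

```python
# Loose sketch (my interpretation, not the paper's design) of slow
# thinking overseeing fast thinking. `critique` is assumed to return a
# string describing problems, empty if the draft looks fine.

def deliberate(generate, critique, prompt, max_rounds=3):
    draft = generate(prompt)                  # fast: one linear left-to-right pass
    for _ in range(max_rounds):
        problems = critique(draft)            # slow: inspect the draft as a whole
        if not problems:
            return draft                      # the overseer is satisfied
        # Unlike plain next-word prediction, this outer loop can revise:
        draft = generate(prompt + "\nKnown problems: " + problems)
    return draft                              # give up after max_rounds
```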

5