nul9090 t1_j9th1xg wrote

Well, at the moment, we can't really know. Quadratic complexity is definitely bad: it limits how far we can push the architecture, and it makes it hard to run these models on consumer hardware. But if we are as close to a breakthrough as some people believe, maybe it isn't a problem.

6

nul9090 t1_j9sqmaf wrote

In my view, the biggest flaw of transformers is that self-attention has quadratic complexity in sequence length. This basically means they will not become significantly faster anytime soon, and context window sizes will grow slowly too.
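To make the quadratic point concrete, here's a toy sketch (mine, not from the comment) of why standard attention scales badly: the score matrix for n tokens has n × n entries, so doubling the context quadruples the work and memory.

```python
import numpy as np

def attention_scores(q, k):
    """q, k: (n, d) arrays of query/key vectors -> (n, n) score matrix."""
    d = q.shape[1]
    # Every token attends to every other token: n*n entries.
    return (q @ k.T) / np.sqrt(d)

n, d = 1024, 64
rng = np.random.default_rng(0)
scores = attention_scores(rng.standard_normal((n, d)),
                          rng.standard_normal((n, d)))
print(scores.shape)  # (1024, 1024): this matrix is the quadratic bottleneck
```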

Linear transformers and Structured State Space Sequence (S4) models are promising approaches to solving that, though.
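For intuition on the linear-transformer idea, here's a minimal sketch (my own illustration, with an assumed ReLU-style feature map `phi`): replace the softmax with a kernel feature map, then reassociate the matrix product as phi(Q) @ (phi(K)^T V), which never materializes the (n, n) matrix and runs in time linear in n.

```python
import numpy as np

def phi(x):
    # Simple positive feature map; real linear transformers use various kernels.
    return np.maximum(x, 0) + 1e-6

def linear_attention(Q, K, V):
    """Q, K: (n, d); V: (n, d_v). Returns (n, d_v) without an (n, n) matrix."""
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                  # (d, d_v): cost O(n * d * d_v)
    Z = Qp @ Kp.sum(axis=0)        # per-row normalizer, cost O(n * d)
    return (Qp @ KV) / Z[:, None]  # cost O(n * d * d_v), linear in n
```

Because the per-row weights are normalized, feeding in a constant V returns that constant back, which is a quick sanity check that the reassociated product still behaves like an attention average.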

My hunch is that LLMs will be very useful in the near term but of little value to an eventual AGI architecture, though I am unable to convincingly explain why.

27

nul9090 t1_j97krdy wrote

The hostility was uncalled for. What you're asking for is a lot of work for a Reddit post. But there are plenty of tests and anecdotes that would lead one to believe it is lacking in important ways in its capacity to reason and to understand.

I'm not a fan of Gary Marcus but he raises valid criticisms here in a very recent essay: https://garymarcus.substack.com/p/how-not-to-test-gpt-3

Certainly, there are even more impressive models to come. I firmly believe that, some day, a machine will surpass human intelligence.

2