red75prime t1_iynkzax wrote

No. ChatGPT didn't show anything unexpected. Memory (working and episodic) still isn't there.

32

EntireContext OP t1_iynldwj wrote

It remembered previous prompts when I talked about them.

4

red75prime t1_iynlrrc wrote

Make sure that the prompt is 2000-3000 words away from the question.

7

EntireContext OP t1_iynm425 wrote

No idea what the context window is, but at the end of the day they can just increase it...

It's already commercially useful right now. It doesn't need a bigger context window to be more useful (although context windows will continue to increase), only qualitatively better intelligence.

0

red75prime t1_iynpyob wrote

It's not feasible to just keep increasing the context window: the computation required grows quadratically with its length.

> It doesn't need more context window to be more useful

It needs memory to be significantly more useful (as in large-scale disruptive) and, probably, other subsystems/capabilities (error detection, continual learning). Its current applications require significant human participation and scaling alone will not change that.
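
To see the quadratic blow-up concretely, here's a back-of-the-envelope sketch of per-layer attention cost (the hidden size of 768 is just an illustrative, GPT-2-ish assumption):

```python
def attention_flops(seq_len: int, d_model: int = 768) -> float:
    """Rough FLOPs for one self-attention layer.

    Both the Q @ K^T score matrix and the softmax(scores) @ V mix
    are seq_len x seq_len, so both terms scale with seq_len**2.
    """
    scores = 2.0 * seq_len * seq_len * d_model  # computing Q @ K^T
    mix = 2.0 * seq_len * seq_len * d_model     # attention-weighted sum of V
    return scores + mix

for n in (2_000, 4_000, 8_000, 16_000):
    print(f"{n:>6} tokens: {attention_flops(n):.2e} FLOPs")
# Each doubling of the window roughly quadruples the attention cost.
```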

14

EntireContext OP t1_iynq6u8 wrote

I mean the context window will increase with incoming models. GPT-1 had a smaller context window than ChatGPT.

2

ChronoPsyche t1_iyp04j8 wrote

It will increase, but the size of the increases will slow down without major breakthroughs. You can't predict the rate of future progress solely from the rate of past progress in the short term.

You guys take the "exponential growth" stuff way too seriously. All it refers to is technological growth over the course of human history; not every time scale follows the exact same growth pattern. If it did, we'd have reached the singularity a long time ago.

Bottlenecks sometimes occur in the short term and the context-window problem is one such bottleneck.

Nobody doubts that we can solve it eventually, but we haven't solved it yet.

There are potential workarounds, like external memory systems, but those only buy more modest effective context-window increases. External memory isn't feasible for AGI because it's far too slow and doesn't scale well dynamically, not to mention it sits outside the neural network itself.

In the end, we either need an algorithmic breakthrough or quantum computers to solve the context-window problem as it relates to AGI. An algorithmic breakthrough is more likely to arrive before quantum computers become viable. If it doesn't, we may be waiting a long time for AGI.

Look into the concept of computational complexity if you want to better understand the issue we are dealing with here.

2

ReadSeparate t1_iynt062 wrote

They can’t just increase it. Self-attention’s time complexity is O(n^2) in the sequence length: each new token must attend to every token already in the window, so the compute per added token grows linearly and the total grows quadratically.

This is an architectural constraint of transformers. We’ll either need a better algorithm than transformers, or a way to encode/decode important information to, say, a database and insert it back into the prompt when it’s required.
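
Something like this toy sketch of the database route (everything here is hypothetical: embed() stands in for a real embedding model, and the similarity search is the crudest possible version):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder: a real system would call an embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

class MemoryStore:
    """Hypothetical external memory: store past facts as vectors,
    retrieve the closest matches, splice them back into the prompt."""

    def __init__(self):
        self.texts, self.vecs = [], []

    def write(self, text: str) -> None:
        self.texts.append(text)
        self.vecs.append(embed(text))

    def read(self, query: str, k: int = 2) -> list:
        sims = np.array(self.vecs) @ embed(query)  # dot product of unit vectors
        top = np.argsort(sims)[::-1][:k]
        return [self.texts[i] for i in top]

memory = MemoryStore()
memory.write("User's name is Alex.")
memory.write("User prefers Python examples.")

question = "What is my name?"
recalled = memory.read(question)
prompt = "Relevant memory:\n" + "\n".join(recalled) + f"\nUser: {question}"
print(prompt)  # the model sees old facts without them living in its weights
```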

9

EntireContext OP t1_iyntah2 wrote

Well, they'll make a better algorithm than transformers then (transformers have already been refined into Performers and the like).

At any rate, I still see AGI in 2025.

−4

EpicMasterOfWar t1_iyo3tr2 wrote

Based on what?

3

EntireContext OP t1_iyo9fg4 wrote

The difference between what was possible in 2019 and what the models can do now.

Back when GPT-2 came out, it could barely produce coherent sentences.

This ChatGPT model does make mistakes, but it always speaks coherently.

0

ReadSeparate t1_iyo883j wrote

I do agree with this comment. It’s plausible that long-term memory isn’t required for AGI (though I think it probably is), or that hacks like reading/writing to a database will be able to simulate it.

I think it may take longer than 2025 to replace transformers, though. They’ve been around since 2017 and we haven’t seen any truly promising successors yet.

I can definitely see a scenario where GPT-5 or 6 has prompts built into its training data that are designed to teach it to use database reads/writes.

Imagine it greets you by name after having seen your name only once, six months ago. It could emit a database-read token with sub-tokens that fetch your name from a database based on some sort of identifier.

It could probably get really good at doing this too if it’s actually in the training data.

Eventually, I could see the model using its coding knowledge to design the database/prompting system on its own.
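
For what it's worth, the harness around such a model could be as simple as this toy loop (the <db_read:...> token syntax and the model_step() stub are made up for illustration, not anything current models actually emit):

```python
import re

DB = {"user_name": "Alex"}  # stands in for a persistent long-term store

def model_step(context: str) -> str:
    """Placeholder for one decoding pass of a real language model."""
    if "[memory]" in context:
        name = context.rsplit("=", 1)[-1].strip()
        return f"Hello again, {name}!"
    return "<db_read:user_name>"  # the model 'asks' for a lookup

def run(context: str) -> str:
    out = model_step(context)
    m = re.match(r"<db_read:(\w+)>", out)
    if m:  # intercept the read token, fetch the value, resume generation
        key = m.group(1)
        context += f"\n[memory] {key}={DB.get(key, '')}"
        return run(context)
    return out

print(run("User: hi, remember me?"))  # -> Hello again, Alex!
```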

2

ChronoPsyche t1_iyp084x wrote

Eventually, sure, but without any knowledge of specific breakthroughs that will happen very soon, your 2025 estimate is an uninformed guess at best.

1

EntireContext OP t1_iyskmjg wrote

I don't see a need for specific breakthroughs. I believe the rate of progress we've been seeing since 2012 will get us to AGI by 2025.

0

ChronoPsyche t1_iytra7q wrote

Well, you can believe whatever you want, but you're not basing those beliefs on anything substantive.

Honestly, the rate of progress since 2012 has been very slow. It's only in the past few years that things have picked up substantially, and that was only because of recent breakthroughs with transformer models.

That's kind of how the history of AI progress has worked: a breakthrough leads to a surge in progress, the surge eventually plateaus and stalls for a while as bottlenecks are reached, and then a new breakthrough sets off another surge.

It's not guaranteed there will be another plateau before AGI, but we're gonna need new breakthroughs to get there, because as I said, we are approaching bottlenecks with the current technology that will slow down the rate of progress.

That's not necessarily a bad thing, by the way. Our society isn't currently ready to handle AGI. It's good to have some time pass to actually integrate the new technology rather than developing it faster than we can even use it.

1