turnip_burrito t1_j9i94b1 wrote on February 22, 2023 at 3:53 AM

Now you got me excited about 2-3 years from now when the order of magnitude jumps 10x again or more.

Right now that's a good amount. But when it ncreases again by 10x, that would be enough to handle multiple very large papers, or a whole medium size novel plus some.

In any case, say hello to loading tons of extra info into short term context to improve information synthesis.

You could also do computations within the context window by running mini "LLM programs" within it while working on a larger problem, using it as a workspace to solve a problem.

TFenrir t1_j9iaxdg wrote on February 22, 2023 at 4:09 AM

I really want to see how coherent and sensible it can be at 32k tokens, and a fundamentally better model. Could it write a whole short story off a prompt?

turnip_burrito t1_j9ib4kx wrote on February 22, 2023 at 4:10 AM

That's a really good question. I want to see too how reasoning, coherence, and creativity are affected by large context length.

dasnihil t1_j9jzkpp wrote on February 22, 2023 at 2:54 PM

early glimpses of sophisticated and extremely coherent sounding output.

visarga t1_ja8cjid wrote on February 27, 2023 at 4:31 PM

There's much less long form data to train on. That's problematic.

diabeetis t1_j9irqxa wrote on February 22, 2023 at 7:01 AM

the order of magnitude will always jump by 10x

turnip_burrito t1_j9it0u6 wrote on February 22, 2023 at 7:17 AM

No, could be 100x or 1000x.

redpnd t1_j9j6dl9 wrote on February 22, 2023 at 10:18 AM

Those are two and three order of magnitudes.

turnip_burrito t1_j9j74rx wrote on February 22, 2023 at 10:28 AM

Yes, I said "when the order of magnitude jumps by 10x or more".

Hence a jump by one order of magnitude, (10x), or two orders of magnitude (100x), or three orders of magnitude (1000x).

You can jump by more than one order of magnitude. Diabeetus' comment is wrong, because the order of magnitude can grow more than one order.

diabeetis t1_j9jb6wd wrote on February 22, 2023 at 11:21 AM

I mean who cares but I think the standard way of expressing that thought would be "jumps by 2 or 3 orders of magnitude"

turnip_burrito t1_j9jbkjo wrote on February 22, 2023 at 11:25 AM

I think the intended meaning is quite clear by context, but your point is taken.

redpnd t1_j9kg57f wrote on February 22, 2023 at 5:12 PM

🤗

iamozymandiusking t1_j9ke88e wrote on February 22, 2023 at 5:00 PM

Agreed, but also at this pace, I doubt it will take that long.

grimorg80 t1_j9ke8k6 wrote on February 22, 2023 at 5:00 PM

Two three years?? It's gonna happen way sooner than that.

nexapp t1_j9m0ja0 wrote on February 22, 2023 at 10:59 PM

>https://twitter.com/transitive_bs/status/1628118163874516992?s=20

make it 100x more as the most likely estimate given present lightning speed progression.

gONzOglIzlI t1_j9izs1r wrote on February 22, 2023 at 8:45 AM

I'm I the only one wondering how quantum computers will factor in to all of this?
Feels like a hidden wild card, could expend the token budged exponentially.

turnip_burrito t1_j9j1pe8 wrote on February 22, 2023 at 9:12 AM

Why would it expand the token budget exponentially?

Also we have nowhere near enough qubits to handle these kinds of computations. The number of bits you need to run these models is huge (GPT3 ~170bil or 10^11 parameters). Quantum computers nowadays are lucky to be around 10^3 qubits, and they decohere too quickly to be used for very long (about 10^-4 seconds). * numbers pulled from a quick Google search.

That said, new (classical computer) architectures do exist that can use longer context windows: H3 (Hungry Hungry Hippos) and RWVST or whatever it's called.

D2MAH t1_j9jb0sf wrote on February 22, 2023 at 11:19 AM

I’m not quite sure if quantum computing is even needed for AGI. It’s so far behind and we’re so far ahead.

ChezMere t1_j9owekb wrote on February 23, 2023 at 3:09 PM

Quantum computers work nothing like how you think they do, and are completely useless for AI (as well as almost all other classes of problem).

gONzOglIzlI t1_j9so4my wrote on February 24, 2023 at 7:19 AM

"Quantum computers are completely useless for AI."
Bold prediction, we'll see how it ages.
Can't say I'm anything close to an expert, but I do have masters CS, was a competitive programmer and am a professional programmer now with 10y of xp.

GPT-5entient t1_j9l4ex1 wrote on February 22, 2023 at 7:40 PM

32k tokens would mean approximately 150 kB of text. That is a decent sized code base! Also with this much context memory the known context saving tricks would work much better so this could be theoretically used to create code bases of virtually unlimited size.
This amazes me and also (being a software dev) also scares me...
But, as they say, what a time to be alive!

GoldenRain t1_j9j3fb1 wrote on February 22, 2023 at 9:37 AM

I wonder how expensive each prompt is though.

GPT-5entient t1_j9l5ph0 wrote on February 22, 2023 at 7:47 PM

From that table it looks like it will be 6x more expensive than ChatGPT's model. It looks like you need 600 units per instance vs. 100. Not sure how this translates into raw token cost though, but it seems that it is going to be more expensive once they expose serverless pay-as-you-go pricing. text-davinci-003 is $0.02 per 1k token so this could be $0.12 per kilotoken.

YobaiYamete t1_j9k8s5j wrote on February 22, 2023 at 4:26 PM

> That's about 30k words.

Dude give me this, but for a character Ai style chat or NovelAI style RP

visarga t1_ja8c2yk wrote on February 27, 2023 at 4:28 PM

I tested a paper quickly and it was 20K tokens in 200KB of text.

OpenAI has privately announced a new developer product called Foundry

TFenrir t1_j9i5653 wrote on February 22, 2023 at 3:22 AM