
turnip_burrito t1_j9i94b1 wrote

Now you got me excited about 2-3 years from now when the order of magnitude jumps 10x again or more.

Right now that's a good amount. But when it increases again by 10x, that would be enough to handle multiple very large papers, or a whole medium-sized novel plus some.

In any case, say hello to loading tons of extra info into short term context to improve information synthesis.

You could also do computations within the context window by running mini "LLM programs" within it while working on a larger problem, using it as a workspace to solve a problem.
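That "workspace" idea can be sketched in a few lines. This is a hypothetical scaffold, not any real API: `call_llm` is a stub standing in for a model call, and the function names are made up for illustration. The point is that each sub-call sees all prior intermediate results because they are appended back into one growing context.

```python
def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call (hypothetical)."""
    # A real model would generate text; the stub just echoes the last line
    # so the control flow can run on its own.
    return f"[result of: {prompt.splitlines()[-1]}]"


def solve_with_scratchpad(problem: str, subtasks: list[str]) -> str:
    """Run mini 'LLM programs' inside one growing context window."""
    context = f"Problem: {problem}\nScratchpad:"
    for task in subtasks:
        context += f"\n{task}"
        # Each sub-call sees the whole scratchpad so far, so later
        # steps can build on earlier intermediate results.
        context += f"\n{call_llm(context)}"
    return context


workspace = solve_with_scratchpad(
    "summarize three papers and compare them",
    ["Summarize paper A", "Summarize paper B", "Compare A and B"],
)
print(workspace)
```

With a 10x larger context window, the scratchpad can hold far more intermediate state before anything has to be summarized away.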

52

TFenrir t1_j9iaxdg wrote

I really want to see how coherent and sensible it can be at 32k tokens, and with a fundamentally better model. Could it write a whole short story off a prompt?

24

turnip_burrito t1_j9ib4kx wrote

That's a really good question. I also want to see how reasoning, coherence, and creativity are affected by a large context length.

13

dasnihil t1_j9jzkpp wrote

Early glimpses of sophisticated and extremely coherent-sounding output.

1

visarga t1_ja8cjid wrote

There's much less long-form data to train on. That's problematic.

1

diabeetis t1_j9irqxa wrote

the order of magnitude will always jump by 10x

7

turnip_burrito t1_j9it0u6 wrote

No, could be 100x or 1000x.

3

redpnd t1_j9j6dl9 wrote

Those are two and three order of magnitudes.

5

turnip_burrito t1_j9j74rx wrote

Yes, I said "when the order of magnitude jumps by 10x or more".

Hence a jump of one order of magnitude (10x), two orders of magnitude (100x), or three orders of magnitude (1000x).

You can jump by more than one order of magnitude at a time. diabeetis's comment is wrong, because the jump isn't always a single order.

1

diabeetis t1_j9jb6wd wrote

I mean who cares but I think the standard way of expressing that thought would be "jumps by 2 or 3 orders of magnitude"

0

turnip_burrito t1_j9jbkjo wrote

I think the intended meaning is quite clear by context, but your point is taken.

3

grimorg80 t1_j9ke8k6 wrote

Two to three years?? It's gonna happen way sooner than that.

3

nexapp t1_j9m0ja0 wrote

>https://twitter.com/transitive_bs/status/1628118163874516992?s=20

Make it 100x more as the most likely estimate, given the present lightning-speed progression.

2

gONzOglIzlI t1_j9izs1r wrote

Am I the only one wondering how quantum computers will factor into all of this?
Feels like a hidden wild card; it could expand the token budget exponentially.

0

turnip_burrito t1_j9j1pe8 wrote

Why would it expand the token budget exponentially?

Also, we have nowhere near enough qubits to handle these kinds of computations. The number of parameters in these models is huge (GPT-3: ~175 billion, or about 10^11). Today's quantum computers are lucky to have around 10^3 qubits, and they decohere too quickly to be used for very long (about 10^-4 seconds). *Numbers pulled from a quick Google search.

That said, new (classical computer) architectures do exist that can use longer context windows: H3 (Hungry Hungry Hippos) and RWKV.

5

D2MAH t1_j9jb0sf wrote

I’m not quite sure if quantum computing is even needed for AGI. It’s so far behind and we’re so far ahead.

4

ChezMere t1_j9owekb wrote

Quantum computers work nothing like how you think they do, and are completely useless for AI (as well as almost all other classes of problem).

1

gONzOglIzlI t1_j9so4my wrote

"Quantum computers are completely useless for AI."
Bold prediction; we'll see how it ages.
Can't say I'm anything close to an expert, but I do have a master's in CS, was a competitive programmer, and am now a professional programmer with 10 years of experience.

0