inglandation t1_jdij4o8 wrote
Reply to comment by Username2upTo20chars in [D] "Sparks of Artificial General Intelligence: Early experiments with GPT-4" contained unredacted comments by QQII
> why should the next generation be fundamentally different?
Emergent abilities from scale are the reason. There are many examples of that in nature and in many fields of study. The patterns of snowflakes cannot easily be explained by the fundamental properties of water. You need enough water molecules in the right conditions to create the patterns of snowflakes. I suspect that a similar phenomenon is happening with LLMs, but we haven't yet figured out what the patterns are or what the right conditions are for them to materialize.
Username2upTo20chars t1_jdj8a6k wrote
I don't disagree with the phenomenon of emergence, it's just that it doesn't explain anything. It is one word for "I have no idea how it works" or better: it's magic. The issue I have with that is that you are quick to hide behind that word, using it as an explanation, as accepted as the term "emergence" has become.
But in fact you can't model one bit with it, it has no predictive power and it kind of shuts down discussions.
So far I haven't seen any evidence (have you?) that LLMs are doing anything other than predicting the next token. Yes, there are certain thresholds where they overcome one weakness or another. But in the end they just predict the next token better ... and even better. It's impressive what you can do with that (Chinese-room-like), but that doesn't imply that GPT-4 is any different than GPT-3.5, it's just better.
But as I wrote, you can in theory replace most non-manual work with that somewhere down the line anyway. But no GPT will develop some groundbreaking Deep Learning architecture for you or solve important physics problems which need actual thought and not just more compute or...
Not that you claimed that (I do here), but should GPT-7 or so suddenly do that, then you can hold me to it.
inglandation t1_jdjvmqe wrote
> you can't model one bit with it, it has no predictive power and it kind of shuts down discussions.
For now, yes, my statement is not very helpful. But this is a phenomenon that happens in other fields. In physics, waves and snowflakes are emergent phenomena, but you can still model them pretty well and make useful predictions about them. Life is another example. We understand life pretty well (yes, there are aspects that we don't understand), but it's not clear how we go from organic compounds to living creatures. Put those molecules together in the right amounts and in the right conditions for a long time, and they start developing the structures of life. How? We don't know yet, but it doesn't stop us from understanding life and describing it pretty well.
Here we don't really know what we're looking at yet, so it's more difficult. We should figure out what the structures emerging from the training are.
I don't disagree that LLMs "just" predict the next token, but there is a non-trivial internal structure that picks the right word. This structure is emergent. My hypothesis here is that understanding this structure will allow us to understand how the AI "thinks". It might also shed some light on how we think, as the human brain probably does something similar (but maybe not very similar). I'm not making any definitive statement; I don't think anyone can. But I don't think we can conclude that the model doesn't understand what it is doing based on the fact that it predicts the next token.
I think that the next decades will be about precisely describing what cognition/intelligence is, and under what conditions exactly it can appear.