Submitted by blacklemon67 t3_11misax in MachineLearning
harharveryfunny t1_jbjhmif wrote
Humans don't learn by locking themselves in a room at birth with a set of encyclopedias or a printout of the internet. We learn by interacting with the world - perceive/generalize/theorize/experiment, learn from feedback, etc.
It's impressive how well these LLMs perform given what is really a very tough task - building an accurate world model from nothing but "predict next word" feedback - but it's hardly surprising that they need massive amounts of data to compensate for the task being so tough.
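For readers who want to see what that "predict next word" feedback concretely looks like, here is a minimal sketch in PyTorch. The toy model, vocabulary size, and sequence length are made up for illustration; a real LLM is a transformer, but the training signal is the same cross-entropy loss.

```python
import torch
import torch.nn.functional as F

# Toy illustration of the "predict next word" objective.
# Sizes and the stand-in model are assumptions for the sketch, not a real LLM.
vocab_size, seq_len, d_model = 1000, 16, 64

# Stand-in for a causal language model: embeds tokens, then predicts
# logits over the vocabulary at every position.
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, d_model),
    torch.nn.Linear(d_model, vocab_size),
)

tokens = torch.randint(0, vocab_size, (1, seq_len))  # a "sentence" of token ids
logits = model(tokens)                               # shape: (1, seq_len, vocab_size)

# The only feedback: cross-entropy between the prediction at position t
# and the token that actually appears at position t+1.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
loss.backward()  # every bit of "world knowledge" has to come through this gradient
```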
harharveryfunny t1_jbjk9nb wrote
Just to follow up: the reason the "interact with the world" approach is so much more efficient is that it's largely curiosity-driven - we proactively try to fill gaps in our knowledge rather than reading a set of encyclopedias and hoping they cover what we need to know. We learn in a much more targeted fashion.
visarga t1_jbn5g3w wrote
On the other hand, an LLM has broad knowledge of all topics - a true dilettante. We can't keep up at that level.
mckirkus t1_jbkx54l wrote
Helen Keller is an interesting example of what we are capable of without visual or aural inputs.
farmingvillein t1_jblnh6d wrote
But she still had feedback loops.
currentscurrents t1_jbnandw wrote
I think this is the wrong way to think about what LLMs are doing. They aren't modeling the world; they're modeling human intelligence.
The point of generative AI is to model the function that created the data. For language, that's us. You need all these tokens and parameters because modeling how humans think is very hard.
As LLMs get bigger, they can model us more accurately, and that's where all these human-like emergent abilities come from. They build a world model because it's useful for predicting text written by humans who have a world model. Same thing for why they're good at RL and task decomposition, can convincingly fake emotions, and inherit our biases.
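To put the "model the function that created the data" point in standard maximum-likelihood terms (this framing is an addition, not part of the comment itself): the next-token objective is the cross-entropy between the distribution of human-written text and the model's distribution,

$$
\mathbb{E}_{x \sim p_{\text{human}}}\Big[-\sum_t \log p_\theta(x_t \mid x_{<t})\Big]
= H(p_{\text{human}}) + D_{\mathrm{KL}}\big(p_{\text{human}} \,\|\, p_\theta\big),
$$

which is minimized exactly when $p_\theta = p_{\text{human}}$ - that is, when the model reproduces the distribution of text that humans (with their world models, emotions, and biases) actually write.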