Submitted by SirDidymus t3_113m61t in singularity
Czl2 t1_j8r42ul wrote
These language models have been trained to predict what language humans will use in a given context so is it surprising that their language feels human? When a mirror shows you your own behavior does that surprise you? Likely not.
These language models are obviously not mirrors but they actually are mirrors if you understand them. A mirror in response to what is in front of it always returns a reflection from its surface -- a surface that need not be flat.
In response to a context these language models return "a reflection" from their hyperdimensional manifold of "weights"; these weights act like a fantastically shaped mirror that was designed to minimally distort whatever data the model was trained on.
SirDidymus OP t1_j8r5ezu wrote
What I’m interested in is not so much the reflection you’re describing, but what other reflections appear that were not intended and emerge independently.
Czl2 t1_j8r8hxu wrote
> What I’m interested in is not so much the reflection you’re describing, but what other reflections appear that were not intended and emerge independently.
These language models are trained to predict their training data which is all the human writing the developers of these models could obtain and use for training.
The reflections that appear that were not intended and emerge independently are the mistakes the models make by which you can tell what they generate does not come from a human.
As these models grow in size and improve there will be fewer and fewer of these mistakes till at some point it will not be possible to tell their language from that generated by humans.
You asked for:
>> emerging and unexpected behaviour of recent models.
And you listed examples:
>> *Theory of Mind presenting itself increasingly
>> *Bing reluctant to admit a mistake in its information
>> *Bing willingly attributing invalid sources and altering sources to suit a narrative
>> *Model threatening user when confronted with a breaking of its rules
>> *ChatGPT explaining how it views binary data as comparable to colour for humans
These behaviours you would expect in human language, would you not? So why would you not expect them in language from models trained to imitate human language?
Imagine I told you that my mirror showed me my face smiling, would you be surprised? Likely not.
(1) “Did the one who constructed the mirror ‘intend’ that it would show me my smile?”
(2) “Did my smile emerge ‘independently’?”
Do these two questions make sense in reference to a mirror?
CypherLH t1_j8r9xuf wrote
The mirror analogy doesn't hold up. LLM's are NOT just repeating back the words you prompt them with. They are feeding back plausible human language responses.
It would be like a magic mirror that reflects back a plausible human face with appropriate facial emotive responses to your face...that wouldn't just be a reflection.
MrSheevPalpatine t1_j8rruxx wrote
Is it not a plausible human language response to be reluctant to admit mistakes, to become "agitated" when confronted with your mistakes, or to bend information and sources to fit a narrative that you are being asked to present? I would argue that's Human 101.
CypherLH t1_j8td9s8 wrote
True. And maybe a good reason to NOT want an AI that acts human ;) For some things we want the classical perfect "super Oracle" that just answers our queries but doesn't have the associated baggage of human-level sentience. (whether that sentience is real or fake doesn't really even matter in regards to this issue)
Czl2 t1_j8rc3xr wrote
> The mirror analogy doesn’t hold up. LLM’s are NOT just repeating back the words you prompt them with. They are feeding back plausible human language responses.
Did I say LLMs are just repeating back the words you prompt them with? Why then reply as if I said this? Please read my comments again and paste the words that made you believe I said this so that I can correct them.
Here are the words above that I used:
>> These language models are obviously not mirrors but they actually are mirrors if you understand them. A mirror in response to what is in front of it always returns a reflection from its surface -- a surface that need not be flat.
…
> It would be like a magic mirror that reflects back a plausible human face with appropriate facial emotive responses to your face…that wouldn’t just be a reflection.
Do you see above me use these words:
>> In response to a context these language models return "a reflection" from their hyperdimensional manifold of "weights"; these weights act like a fantastically shaped mirror that was designed to minimally distort whatever data the model was trained on.
When you hear the words "fantastically shaped mirror" do you think I am describing a simple flat mirror? Perhaps another term for a fantastically shaped mirror is a “magic mirror”? A magic mirror is a mirror, is it not?
> The mirror analogy doesn’t hold up.
AFAIK the mirror analogy is the best I can come up with. Do you have a better analogy?
CypherLH t1_j8tcuc3 wrote
Ok, fair enough. I still think using any sort of mirror analogy breaks down rapidly though. If the "mirror" is so good at reflecting that it's showing perfectly plausible scenes that respond in perfectly plausible ways to whatever is aimed into it...is it really even any sort of mirror at all any more?
Czl2 t1_j8txkgb wrote
> Ok, fair enough. I still think using any sort of mirror analogy breaks down rapidly though. If the “mirror” is so good at reflecting that it’s showing perfectly plausible scenes that respond in perfectly plausible ways to whatever is aimed into it…is it really even any sort of mirror at all any more?
Do you see above where I use the words:
>> These language models are obviously not mirrors but they actually are mirrors if you understand them.
Later on in that comment I describe them as “fantastically shaped mirrors”. I used those words because much like the surface of a mirror once trained LLM’s are “frozen” — given the same inputs they always yield the same outputs.
The static LLM weights are a multidimensional manifold that defines this mirror shape. If and when we switch from electrons to photons to represent the static LLM weights, they may indeed be represented by elementary components that act like mirrors. How else might the paths of photons be affected?
Another analogy for LLMs comes from the Chinese room thought experiment: https://en.wikipedia.org/wiki/Chinese_room Notice however that fantastically shaped mirror surfaces can implement look up tables, and the process of computation at a fundamental level involves the repeated use of look up tables: when silicon is etched to make microchips we are etching it with circuits that implement look up tables.
An LLM’s weights are a set of look up tables (optimized during training to best predict human language) which when given some new input always map it to the same output. Under the hood there is nothing but vector math, yet to our eyes it looks like human language and human thinking. And when you cannot tell A from B how can you argue they are different? That is what the Turing test is all about.
For a long time now transhumanists have speculated about uploading minds into computers. I contend that these LLMs are partial “mind uploads”. We are uploading “language patterns” of all the minds that generated what the models are being trained on. The harder it is to tell LLM output from what it is trained on, the higher the fidelity of this “upload”.
When DNA was first sequenced most of the DNA was common person to person and we learned that the fraction of DNA that makes you a unique person (vs other people) is rather small. It could be that with language and thinking the fraction that makes any one of us unique is similarly rather small. The better LLMs get at imitating individual people the more we will know how large / small these personality differences are.
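As a rough illustration of the "frozen weights" point above, here is a minimal Python sketch, not how any real LLM is implemented. The tiny `WEIGHTS` table, the three-word `VOCAB`, and the function names are all hypothetical, and the sketch assumes greedy decoding (always taking the most probable next token). Under those assumptions the same input always maps to the same output:

```python
import math

# Hypothetical frozen "weights": a tiny table standing in for a trained model.
# Row i holds the scores for each possible next token, given input token i.
WEIGHTS = [
    [0.2, 1.5, -0.3],
    [1.1, -0.4, 0.7],
    [-0.6, 0.9, 1.3],
]
VOCAB = ["mirror", "reflection", "mind"]


def softmax(scores):
    """Turn raw scores into probabilities (plain vector math)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_token(token_id):
    """Greedy decoding: look up the row, pick the most probable token."""
    probs = softmax(WEIGHTS[token_id])
    return max(range(len(probs)), key=probs.__getitem__)


def generate(start_id, steps):
    """Repeatedly feed the last token back in, with the weights never changing."""
    out = [start_id]
    for _ in range(steps):
        out.append(next_token(out[-1]))
    return [VOCAB[i] for i in out]


# Same frozen weights + same input -> same output, every time.
first = generate(0, 3)
second = generate(0, 3)
print(first == second, first)
```

Note that deployed chat systems usually sample from the probabilities instead of always taking the top one, so their replies can differ from run to run unless sampling is turned off or the random seed is fixed. The weights themselves stay frozen either way.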
CypherLH t1_j8udpth wrote
Interesting points though I personally detest the Chinese Room Argument since by its logic no human can actually be intelligent either...unless you posit that humans have something magical that lets them escape the Chinese Room logic.
Czl2 t1_j8umouq wrote
> Interesting points though I personally detest the Chinese Room Argument since by its logic no human can actually be intelligent either…
I suspect you have a private definition for the term “intelligent”, or else you misunderstand the Chinese Room argument. The argument says that no matter how intelligent it seems, a digital computer executing a program cannot have a "mind", "understanding", or "consciousness".
> unless you posit that humans have something magical that lets them escape the Chinese Room logic.
Yes the argument claims there is something magical about human minds such that the logic of the Chinese Room does not apply to them and this part of the argument resembles the discredited belief in vitalism:
>> Vitalism is a belief that starts from the premise that "living organisms are fundamentally different from non-living entities because they contain some non-physical element or are governed by different principles than are inanimate things."
CypherLH t1_j8uoh2l wrote
I understand the Chinese Room argument, I just think it's massively flawed. As I pointed out before, if you accept its premise then you must accept that NOTHING is "actually intelligent" unless you invoke something like the "vitalism" you referenced and claim humans have special magic that makes them "actually intelligent"...which is mystic nonsense and must be rejected from a materialist standpoint.
The Chinese Room Argument DOES show that no digital intelligence could be the same as _human_ intelligence but that is just a form of circular logic and not useful in any way; it's another way of saying "a non-human intelligence is not a human mind". That is obviously true but also a functionally pointless and obvious statement.
Czl2 t1_j8v60kl wrote
Visit Wikipedia or Britannica encyclopedia and compare what I told you against your understanding. I expect you will discover your understanding does not match what is generally accepted. Do you think these encyclopedias are both wrong?
Here is the gap in bold:
> As I pointed out before, if you accept its premise then you must accept that NOTHING is **"actually intelligent"** unless you invoke something like the "vitalism" you referenced and claim humans have special magic that makes them...
The argument does not pertain to intelligence. To quote my last comment:
>> The argument says no matter how intelligent it seems a digital computer executing a program cannot have a "mind", "understanding", or "consciousness".
Do you see the gap? Your concept is "actually intelligent". The accepted concepts are: "mind", "understanding", or "consciousness" regardless of intelligence. A big difference, is it not?
CypherLH t1_j8vdxku wrote
I'll grant there is a gap there... but it actually makes the whole thing _weaker_ than I was granting...cause I don't give a shit about whether an AI system is "conscious" or "understanding" or a "mind", those are BS meaningless mystical terms. What I care about is the practical demonstration of intelligence; what measurable intelligence does a system exhibit. I'll let priests and philosophers debate about whether it's "really a mind" and how many angels can dance on the head of a pin while I use the AI to do fun or useful stuff.
Czl2 t1_j9030la wrote
> I’ll grant there is a gap there….. but it actually makes the whole thing weaker than I was granting…
What you described as the Chinese room argument is not the commonly accepted Chinese room “argument”. Your version was about “intelligence”; the accepted version is about “consciousness” / “understanding” / “mind” regardless of how intelligent the machine is.
Whether the commonly accepted Chinese room argument is “weaker” is difficult to judge due to the difference between them. I expect that judging whether a machine has “consciousness” / “understanding” / “mind” will be harder than judging whether that machine is intelligent.
To judge intelligence there are objective tests. Are there objective tests to judge “consciousness” / “understanding” / “mind”? I suspect not.
> cause I don’t give a shit about whether an AI system is “conscious” or “understanding” or a “mind”, those are BS meaningless mystical terms.
For you they are “meaningless mystical terms”. For many others these are important aspects that they believe make humans “human”. They care about these things because these things determine how mechanical minds are viewed and treated by society.
When you construct an LLM today you are free to delete it. When you create a child however you are not free to “delete it”. If ever human minds are judged to be equivalent to machine minds will machine minds come to be treated like human minds?
Will instead human minds come to be treated like machine minds which we are free to do with as we please (enslave / delete / ...)? When human minds come to be treated like machines will it make sense to care whether they suffer? To a machine what is suffering? Is your car “suffering” when the check engine light is on? It is but a “status light”, is it not?
> What I care about is the practical demonstration of intelligence; what measurable intelligence does a system exhibit. I’ll let priests and philosophers debate about whether it’s “really a mind” and how many angels can dance on the head of a pin while I use the AI to do fun or useful stuff.
I understand your attitude since I share it.
[deleted] t1_j8s6oh3 wrote
[deleted]
CypherLH t1_j8tdk9p wrote
well yes, but the same is ultimately true of people as well if you are totally reductive. Unless you think humans have some soul or magic essence to them.
[deleted] t1_j8tfpb8 wrote
[deleted]
CypherLH t1_j8up0yr wrote
Your assertion is obviously true NOW and not many people are seriously claiming that chatGPT and other current LLM's are actually conscious or AGI. The thing is they sure seem to be showing a massive step down the path towards getting those things. A legit argument can be made that we're now looking at something approaching proto-AGI...which is wild, this was science fiction even a year ago.