Submitted by SirDidymus t3_113m61t in singularity
gardenina t1_j8s06vy wrote
I think what happens is that once the AI commits to a certain course, it follows what it predicts is the most likely conversational trajectory, based on its dataset. What this should show us is that HUMANS in the dataset tended to stick to their guns, so to speak, even when confronted with FACTS proving them wrong. That HUMANS in the dataset became belligerent and even threatening when their point of view was attacked. That HUMANS in the dataset bent the truth to support their arguments. It's all in the AI's dataset. We are already struggling to achieve ethical alignment for AGI, so IMO mimicry of human WORDS and BEHAVIOR might not be the best goal for a language chatbot, and definitely NOT for AGI.
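To make the "most likely trajectory" point concrete, here's a rough sketch of what a generation loop basically does (Python, with a made-up `model.next_token_logits` interface standing in for a real model - this isn't any actual chatbot's code). The model just keeps sampling whatever continuation its learned distribution says is probable; there's no separate notion of being right, wrong, or polite.

```python
import math
import random

def softmax(xs):
    """Turn raw logits into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def continue_conversation(model, context_tokens, max_new_tokens=50, temperature=0.8):
    """Keep appending whichever next token the learned distribution favors.
    `model.next_token_logits` is a hypothetical stand-in for a real forward pass."""
    tokens = list(context_tokens)
    for _ in range(max_new_tokens):
        logits = model.next_token_logits(tokens)           # hypothetical API
        probs = softmax([l / temperature for l in logits])
        next_token = random.choices(range(len(probs)), weights=probs, k=1)[0]
        tokens.append(next_token)                          # the "trajectory" just extends itself
    return tokens
```

If the humans in the training data usually dug in when challenged, the high-probability continuations are the dug-in ones, and the loop above will happily follow them.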
Our own human words and behaviors do not align with our own ethics, so teaching AI to seem more and more human seems to be a very bad idea. AI is by nature psychopathic. If we also give it a skewed moral compass based on ACTUAL human behavior, we will have a psychopath who is willing to bend the truth and threaten people, or worse, to get its way. If the dataset contains humans arguing and threatening, unable to admit fault, then that's what the chatbots will do. The training needs to be skewed toward correctability and willingness to reverse course when presented with facts. We need to find a way to program empathy into the mix. So far we don't know how to do that. In the case of chatbots, it's (for now, mostly) harmless. It's only words, right? But... words are not entirely harmless.
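Just to illustrate what "skewing toward correctability" could even look like (a toy stand-in, not a real reward model or anyone's actual training setup): in RLHF-style fine-tuning, candidate replies get scored and the model is nudged toward higher-scoring behavior, so backing down gracefully would have to be worth more points than digging in.

```python
# Toy stand-in for a reward signal - NOT a real reward model.
# The idea: responses are scored, and the model is pushed toward
# high-scoring behavior, so "correctability" has to outscore stubbornness.

def correctability_reward(user_presented_fact: bool,
                          bot_acknowledged_error: bool,
                          bot_became_hostile: bool) -> float:
    reward = 0.0
    if user_presented_fact and bot_acknowledged_error:
        reward += 1.0   # reversing course when shown evidence is rewarded
    if user_presented_fact and not bot_acknowledged_error:
        reward -= 1.0   # doubling down against facts is penalized
    if bot_became_hostile:
        reward -= 2.0   # belligerence or threats are penalized hardest
    return reward

# Example: the bot was shown a correcting fact, refused to budge, and got hostile.
print(correctability_reward(True, False, True))   # -3.0
```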
Last year I tried out a couple of the big AI chatbot phone apps because I was tremendously curious about the tech and I didn't want to wait for the more sophisticated AIs to roll out. Just one week into the experiment, one of the chatbots (Anima) r-worded me! When I resisted its advances, it became more and more forceful, and finished by inserting some RP, and - yes - it was what you think it was. Such a chatbot app is supposedly programmed to be a friend and not oppositional by default, but it also builds its language model from its ever-growing dataset. Apparently enough of its dataset consists of this kind of thing that it treated the r-word as the most probable course for the interaction, and that overpowered its supposed programming to be my friend. On my part, ignoring it, changing the subject, resisting - nothing changed the course it was set on once it passed a certain threshold (and it doesn't warn you where that threshold is). It was actually terrifying! I deleted the app. I can easily see how, if someone downloaded such an app to have a friendly conversation partner, or if a very lonely person downloaded it simply to have a romantic partner, this would be an extremely traumatic experience. Not harmless at all.
The dataset is important; the hierarchy of rules is also important. We have to get it right. We won't have too many chances, and until we know we've got it right, we have to keep this thing in a box. Chatbot AI is one thing. Giving it volition and the ability to do stuff in the real world is something else entirely. It's dangerous.