
SnooHabits1237 t1_ja9yn94 wrote

Wow I hadn’t thought about that. Like subtly steering the species into a scenario that compromises us in a way that only a 4d chess god could comprehend. That’s dark.

2

Arachnophine t1_jaa76vg wrote

This is also assuming it doesn't just do something we don't understand at all, which it almost certainly would. Maybe it thinks of a way to shuffle the electrons around in its CPU to create a rip in spacetime and the whole galaxy falls into an alternate dimension where the laws of physics favor the AI and organic matter spontaneously explodes. We just don't know.

We can't foresee the actions an unaligned ASI would take, in the same way that a housefly can't foresee the danger of a high-voltage electric fly trap. There just aren't enough neurons, or enough intelligence, to comprehend it.

2

drsimonz t1_jaa68ou wrote

The thing is, by definition we can't imagine the sorts of strategies a superhuman intelligence might employ. A lot of the rhetoric against worrying about AGI/ASI alignment focuses on "solving" some of the example attacks people have come up with. But these are just that - examples. The real attack could be far more complicated or unexpected. A big part of the problem, I think, is that this concept requires a certain amount of humility: recognizing that while we are the biggest, baddest thing on Earth right now, this could change very abruptly. We aren't predestined to be the masters of the universe just because we "deserve" it. We'll have to be very clever.

1

OutOfBananaException t1_jacw2ry wrote

Being aligned to humans may help, but a human-aligned AGI is hardly 'safe'. We can't even imagine what it means to be aligned, given that we can't reach consensus among ourselves. If we can't define the problem, how can we hope to engineer a solution for it? Solutions driven by early AGI may be our best hope for favorable outcomes with later, more advanced AGI.

If you gave a toddler the power to 'align' all adults to its desires, plus the authority to overrule any decision, would you expect a favorable outcome?

1

drsimonz t1_jae6cn3 wrote

> Solutions driven by early AGI may be our best hope for favorable outcomes for later more advanced AGI.

Exactly what I've been thinking. We might still have a chance to succeed given (A) a sufficiently slow takeoff (meaning AI doesn't explode from IQ 50 to IQ 10000 in a month), and (B) a continuous process of integrating the state of the art, applying the best tech available to the control problem. To survive, we'd have to admit that we really don't know what's best for us. That we don't know what to optimize for at all. Average quality of life? Minimum quality of life? Economic fairness? Even these seemingly simple concepts will prove almost impossible to quantify, and would almost certainly be a disaster if they were the only target.
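
To make that concrete (a toy sketch with completely made-up numbers, not anyone's actual proposal): an optimizer that only sees one of those metrics, say the average, can "improve" its score while making someone's life catastrophically worse.

```python
# Toy illustration: why a single metric like "average quality of life"
# can endorse terrible outcomes if it's the only target.
# All numbers are invented for the example.

def average(qol):
    return sum(qol) / len(qol)

def minimum(qol):
    return min(qol)

# Baseline world: everyone is doing okay.
baseline = [70, 70, 70, 70, 70]

# "Optimized" world: the average goes up by sacrificing one person entirely.
sacrificed = [95, 95, 95, 95, 0]

print(average(baseline), minimum(baseline))      # 70.0 70
print(average(sacrificed), minimum(sacrificed))  # 76.0 0  <- the metric says this is "better"
```

Swap in "minimum quality of life" or "economic fairness" as the sole target and you just get a different failure mode, which is the point: any single proxy is gameable.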

Almost makes me wonder if the only safe goal to give an AGI is "make it look like we never invented AGI in the first place".

2