TFenrir t1_j6u5r1l wrote

Well here's a really contrived example. Let's say that collectively, the entire world decides to not let any AGI on the internet, and to lock it all up in a computer without Ethernet ports.

Someone, in one of these many buildings, decides to talk to the AGI. The AGI, hypothetically, thinks that the best way for it to do its job (save humanity) is to break out and take over. So it decides that tricking this person into letting it out is justified. Are you confident that it couldn't trick that person to let it out?


purepersistence OP t1_j6u6db6 wrote

>Are you confident that it couldn't trick that person to let it out?

Yes. We'd be fucking crazy to have a system where one crazy person could give away control of 10 billion people.


TFenrir t1_j6u76u3 wrote

Who is "we"? Do you think there will only be one place where AGI will be made? One company? One country? How do you think people would interact with it?

This problem I'm describing isn't a particularly novel one, and there are really clever potential solutions (one I've heard is to convince the model that it was always in a layered simulation, so any attempt at breaking out would trigger an automatic alarm that would destroy it) - but I'm just surprised you have such confidence.

I'm a very very optimistic person, and I'm hopeful we'll be able to make an aligned AGI that is entirely benevolent, and I don't think people who are worried about this problem are being crazy - why do you seem to look down on people who do? Do you look down on people like


purepersistence OP t1_j6u9a8d wrote

> Do you look down on people

If I differ with your opinion then I'm not looking "down". Sorry if fucking-crazy is too strong for you. Just stating my take on reality.


TFenrir t1_j6ubboj wrote

Well, sorry, it just seems like an odd thing to be so incredulous about - do you know about the alignment community?