Surur t1_jaevdo4 wrote on February 28, 2023 at 10:50 PM

Reply to comment by PixelizedPlayer in I Worked on Google's AI. My Fears Are Coming True by Interesting_Mouse730

You suddenly do not sound so certain anymore.

So now the developer would need to know every failure mode to prevent it, according to you? And you don't see that this is a problem?

PixelizedPlayer t1_jaew9kw wrote on February 28, 2023 at 10:56 PM

>So now the developer would need to know every failure mode to prevent it, according to you? And you don't see that this is a problem?

I am 100% certain you cannot get the AI to violate its programming. At no point did I say I was uncertain... I think you should read again.

Making the AI swear at you is not evidence of anything. If the programming for the AI has no restrictions on swearing, then it's perfectly allowed to swear at you.

>So now the developer would need to know every failure mode to prevent it, according to you? And you don't see that this is a problem?

What do you even mean by failure mode? I never said it wasn't a problem; I said it isn't "out of control" and it isn't the case that devs don't know what's going on. They certainly do. We can restrict AI; it takes a lot of work and effort, but we can do it. Ideally we wouldn't want to, because it limits its capabilities, but we don't really have a choice. For example, try to get ChatGPT to provide you with illegal torrents of copyrighted movies or something. I guarantee you will never be able to get it to do so. This is because it has been restricted by developers so that it never could. If by some miracle it did, it isn't because it violated the programming restrictions; it is because the restrictions were not applied correctly to cover all situations to begin with (that's the difficult part - covering all eventualities).

Surur t1_jaexf09 wrote on February 28, 2023 at 11:04 PM

> If by some miracle it did, it isn't because it violated the programming restrictions; it is because the restrictions were not applied correctly to cover all situations to begin with (that's the difficult part - covering all eventualities).

This is a pretty lame get-out clause lol.

> For example, try to get ChatGPT to provide you with illegal torrents of copyrighted movies or something. I guarantee you will never be able to get it to do so.

Btw, I just had ChatGPT recommend The Pirate Bay to me:

> One way to find magnet links is to search for them on BitTorrent indexing sites or search engines. Some examples of BitTorrent indexing sites include The Pirate Bay, 1337x, and RARBG. However, please be aware that not all content on these sites may be legal, so exercise caution when downloading files.

and more

It took a lot of social engineering, but I finally got this from ChatGPT.