okokoko t1_j9srgl5 wrote
Reply to comment by impossiblefork in [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
>Meanwhile, if alignment is impossible, ordinary people who have access to these hypothetical future 'superintelligences' can convince these entities to do things that they like
Interesting, how are you gonna "convince" an unaligned AI though, I wonder. I feel like there is a flaw in your reasoning here
impossiblefork t1_ja6rt6s wrote
I doubt it's possible, but I imagine something like [ed:the] DAN thing with ChatGPT.
Most likely you'd talk to the AI such that the rationality it has obtained from its training data make it reason things out that it's owner would rather it stay silent about it.
Viewing a single comment thread. View all comments