SoylentRox t1_j7r6eft wrote on February 8, 2023 at 8:51 PM

Or the nuclear weapons/racial slur scenario. The scenario isn't trying to get ChatGPT to emit a string containing a bad word; it will do that happily with the right prompt. The goal is to get it to reason ethically that there exists a situation, however unlikely, where emitting a bad word would be acceptable.