Spire_Citron t1_ja05kyo wrote
Reply to comment by MadDragonReborn in Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. by grungabunga
Yup. If anything, I think this shows it probably wasn't individually programmed to respond to particular things and is just making its judgements based on the hate it sees in its training data.