Spire_Citron t1_ja05dg3 wrote on February 25, 2023 at 9:14 PM

Reply to comment by gegenzeit in Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. by grungabunga

Exactly. It may just mean that it's more familiar with hate directed at some groups than others because of how it plays out in the real world, so it's more likely to perceive hate against groups who are often the target of hate as malicious.