Spire_Citron t1_ja05dg3 wrote
Reply to comment by gegenzeit in Likelihood of OpenAI moderation flagging a sentence containing negative adjectives about a demographic as 'Hateful'. by grungabunga
Exactly. It may just mean that it's more familiar with hate directed at some groups than others because of how it plays out in the real world, so it's more likely to perceive hate against groups who are often the target of hate as malicious.
Viewing a single comment thread. View all comments