maskedpaki t1_j3cxzob wrote on January 7, 2023 at 5:59 PM

Reply to comment by MajorUnderstanding2 in Now that’s pretty significant! (By Anthropic) by MajorUnderstanding2

Don't you think they would test it on things outside the training data when doing these tests, to avoid misleading people?

From what I've heard, Anthropic has high ethical standards and is primarily focused on AI safety?

overlordpotatoe t1_j3dlot5 wrote on January 7, 2023 at 8:33 PM

You would think, but if this AI is trained like other AIs, where a massive amount of text data gets dumped in without necessarily being closely curated, it would be difficult to know this common riddle wasn't in there somewhere.

Homicidal_Duck t1_j3g6ag9 wrote on January 8, 2023 at 9:10 AM

The point isn't that you'd specifically remove this riddle, or bank on its absence from the training data, but that you'd feed it a riddle that's similar in premise while using little of the same language.