sothatsit t1_j5hhb31 wrote on January 23, 2023 at 1:03 AM
Reply to [D] With more compute could it be easy to quickly un Mask all the people on Reddit by using text correlations to non masked publicly available text data? by Loquzofaricoalaphar

I’ve actually done some work on this, and the real issues are:
- You’d need a lot of text from other sources that carries people’s real names.
- You’d need the user to have written a lot of Reddit comments or posts.
- The style of the user’s writing would need to match between Reddit and your other source.

If you’re interested though, I made the following library for my Master’s thesis, which can be used for this: https://github.com/TycheLibrary/Tyche

However, it would need more work to get close to identifying thousands, never mind millions, of users.
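
To make the three requirements above a bit more concrete, here is a minimal sketch of the general idea of matching writing style between two sources. It is not Tyche's API and not the method from the thesis; it swaps in a plain character-trigram profile compared by cosine similarity, and every name in it (`trigram_profile`, `cosine_similarity`, `rank_candidates`) is made up for illustration.

```python
from collections import Counter
from math import sqrt


def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams in lower-cased, whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(anonymous_text: str, named_texts: dict) -> list:
    """Rank named authors by how closely their text matches the anonymous text.

    named_texts is a hypothetical mapping of real name -> text known to be theirs.
    """
    target = trigram_profile(anonymous_text)
    scores = {
        name: cosine_similarity(target, trigram_profile(text))
        for name, text in named_texts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With only a few sentences per author the profiles are too sparse for the ranking to mean much, and a user who writes differently on Reddit than elsewhere will not score well against their own named text, which is why the amount of text and the style match both matter.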