PredictorX1 t1_j5h5pb5 wrote
Reply to comment by Loquzofaricoalaphar in [D] With more compute could it be easy to quickly un Mask all the people on Reddit by using text correlations to non masked publicly available text data? by Loquzofaricoalaphar
The biggest technical challenges I see:
- Having enough reference samples from known people
- The difference between how people write on Reddit and how they write elsewhere (professional articles, e-mail, etc., which would presumably serve as the reference)
- If too many Reddit users are being considered, it may all dissolve into mush (estimated probabilities would all be low)
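To put rough numbers on the "mush" point, here is a minimal sketch, not a real stylometry model. It assumes a uniform prior over candidates and that the true author's text is a fixed factor more likely than under any other candidate. The function name `top_posterior` and the ratio of 1000 are made up for illustration.

```python
def top_posterior(num_candidates: int, likelihood_ratio: float) -> float:
    """Posterior probability of the true author under a uniform prior.

    Assumption (hypothetical): the observed text is `likelihood_ratio` times
    more likely under the true author than under each other candidate.
    By Bayes' rule the posterior is then r / (r + N - 1).
    """
    return likelihood_ratio / (likelihood_ratio + num_candidates - 1)


if __name__ == "__main__":
    # Same evidence strength, growing candidate pool.
    for n in (10, 1_000, 100_000, 10_000_000):
        p = top_posterior(n, likelihood_ratio=1000.0)
        print(f"{n:>10,} candidates -> posterior {p:.6f}")
```

With the evidence strength held fixed, the posterior shrinks roughly in proportion to the size of the candidate pool, so the per-user evidence would have to get stronger as more users are considered.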
Loquzofaricoalaphar OP t1_j5h6s4z wrote
That is interesting to think about. I’m inclined to think writing patterns involve many variables and are fairly unique. Perhaps analyzing them at scale without getting mush is more of a modeling problem than a compute problem.