Submitted by von-hust t3_11jyrfj in MachineLearning
graphicteadatasci t1_jbdt33t wrote
Reply to comment by enjakuro in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
Yeah, because there are some very nice results on classification models where they removed data that doesn't contribute to learning, and it made training faster and more accurate. But of course I can't remember at all what the paper was called.
enjakuro t1_jbf0yco wrote
Same hahaha, would've linked it otherwise xD