Submitted by von-hust t3_11jyrfj in MachineLearning
von-hust OP t1_jb5ef3f wrote
Reply to comment by LetterRip in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
I would, but I don't have the CLIP features. I'll release some training
code so that it's possible for others to train their indices. The method
should scale to 5B, even on a single node; you'll just need more RAM.