Submitted by von-hust t3_11jyrfj in MachineLearning
von-hust OP t1_jb55rhp wrote
Reply to comment by JrdnRgrs in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
I think the first version of SD was trained with duplicates, and they made some effort to remove duplicates when training v2 (people on Discord are saying pHash or something similar). I suppose it'd be interesting to see whether the same prompts can still get verbatim copies out of v2.
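For anyone unfamiliar with pHash, here is a rough sketch of how perceptual-hash deduplication tends to work. To be clear, this is not the pipeline used for SD v2 (I don't know what was actually run); it's a toy pure-Python version that assumes each image has already been converted to a small square grayscale grid. The function names and the `max_distance` threshold are made up for illustration.

```python
import math
from itertools import combinations
from statistics import median


def phash(pixels, hash_size=8):
    """Toy DCT perceptual hash of a square grayscale image (list of rows).

    Assumes the image was already resized to n x n (e.g. 32 x 32) elsewhere.
    """
    n = len(pixels)
    # Cosine table for the lowest `hash_size` DCT-II frequencies (unnormalized).
    cos = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
           for u in range(hash_size)]
    # Compute only the low-frequency hash_size x hash_size block of the 2D DCT.
    coeffs = []
    for v in range(hash_size):
        for u in range(hash_size):
            total = 0.0
            for y in range(n):
                row, cy = pixels[y], cos[v][y]
                for x in range(n):
                    total += row[x] * cos[u][x] * cy
            coeffs.append(total)
    # One bit per coefficient: is it above the median of the block?
    mid = median(coeffs)
    bits = 0
    for c in coeffs:
        bits = (bits << 1) | int(c > mid)
    return bits


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def find_near_duplicates(hashes, max_distance=4):
    """Return pairs of keys whose hashes differ by at most `max_distance` bits.

    `max_distance` is an arbitrary illustrative threshold, not a known setting.
    """
    return [(k1, k2)
            for (k1, h1), (k2, h2) in combinations(hashes.items(), 2)
            if hamming(h1, h2) <= max_distance]
```

The all-pairs comparison is quadratic, so at LAION scale you'd need some kind of index or bucketing on the hashes instead of `find_near_duplicates` as written.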