orbital_lemon t1_j7idllq wrote

It saw stock photo watermarks millions of times during training. Nothing else in the training data comes even close. Even at half a bit per training image, that can add up to memorization of a shape.

Apart from the handful of known cases involving images that are duplicated many times in the training data, actual image content can't be reconstructed the same way.

2

pm_me_your_pay_slips t1_j7l6icx wrote

Note that the autoencoder (VAE) part of the SD model alone can encode and decode arbitrary natural or human-made images quite well, with very few artifacts. The diffusion model part of SD learns a distribution over images in that encoded latent space.
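The two-stage setup described above (compress images into a latent space, then model that space) can be sketched with a toy stand-in. This is not SD's actual autoencoder, just a minimal linear autoencoder built from an SVD in NumPy; the dataset, dimensions, and helper names are all illustrative assumptions.

```python
import numpy as np

# Toy stand-in for the latent-diffusion split (NOT Stable Diffusion's
# real VAE): a linear autoencoder compresses "images" into a small
# latent space; the generative (diffusion) part would then be trained
# on these latents rather than on raw pixels.
rng = np.random.default_rng(0)

# Fake dataset: 200 flattened 8x8 "images" that actually lie in a
# 4-dimensional subspace, so a 4-dim latent code can represent them.
basis = rng.normal(size=(4, 64))
data = rng.normal(size=(200, 4)) @ basis

# Fit encoder/decoder: top-4 right singular vectors of the data.
_, _, vt = np.linalg.svd(data, full_matrices=False)
components = vt[:4]                   # shape (4, 64)

def encode(x):
    return x @ components.T           # pixels -> 4-dim latents

def decode(z):
    return z @ components             # latents -> pixels

latents = encode(data)                # what a diffusion model would see
recon = decode(latents)
err = float(np.max(np.abs(recon - data)))
print(f"latent shape: {latents.shape}, max reconstruction error: {err:.2e}")
```

The point of the sketch: the heavy lifting of reproducing pixel detail lives in the decoder, while the generative model only ever sees the much smaller latent codes.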

1

orbital_lemon t1_j7lel1d wrote

The diffusion model weights are the part at issue, no? The question is whether you can squeeze infringing content out of those weights to feed to the VAE.

1