andrew21w t1_itxt7q6 wrote on October 27, 2022 at 2:13 AM

This looks neat as hell. Is there more literature where I can learn about this?

SleekEagle OP t1_itzkagl wrote on October 27, 2022 at 1:41 PM

This is the first paper on this approach! I spoke to the authors and they're planning on continuing research down this avenue, so be on the lookout for more papers! Personally, I think dropping in a PFGM as the base generator for Imagen and keeping Diffusion Models for the super-resolution chain would be very cool.

andrew21w t1_itzpm2n wrote on October 27, 2022 at 2:18 PM

I would love to see this used for image-to-image transformations. I can see plenty of potential there.

SleekEagle OP t1_itzqvcr wrote on October 27, 2022 at 2:27 PM

True! Although it's unclear how to deal with img2img when the images have different sizes.

andrew21w t1_itzrk2g wrote on October 27, 2022 at 2:32 PM

Honestly, if one could deal with that, in theory you could do the same with latent codes, essentially making a better GAN or whatnot (assuming I've understood how these models work, of course).