andrew21w t1_itxt7q6 wrote

This looks neat as hell. Is there more literature where I can learn about this?

2

SleekEagle OP t1_itzkagl wrote

This is the first paper on this approach! I spoke to the authors and they're planning on continuing research down this avenue, so be on the lookout for more papers! (Personally, I think dropping in a PFGM as the base generator for Imagen and keeping Diffusion Models for the super-resolution chain would be very cool.)

3

andrew21w t1_itzpm2n wrote

I would love to see this being used for image-to-image transformations. I can see plenty of potential for this.

2

SleekEagle OP t1_itzqvcr wrote

True! Although it's unclear how to deal with img2img with different sizes

2

andrew21w t1_itzrk2g wrote

Honestly, if one could deal with that, in theory you could do that with latent codes, essentially making a better GAN or whatnot (assuming I've understood how these models work, of course).

2