Submitted by SleekEagle t3_yesiy5 in Futurology
Comments
FuturologyBot t1_itzq533 wrote
The following submission statement was provided by /u/SleekEagle:
Background
The past few years have seen incredible progress in the text-to-image domain. Models like DALL-E 2, Imagen, and Stable Diffusion can generate extremely high-quality, high-resolution images given only a sentence describing what should be depicted.
These models rely heavily on Diffusion Models (DMs). DMs are a relatively new type of Deep Learning method inspired by physics. They learn to start with random "TV static" and then progressively "denoise" it into an image. While DMs are very powerful and can create amazing images, they are relatively slow.
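To make the "denoising" idea concrete, here is a rough sketch of DDPM-style ancestral sampling (this is not the code of any particular model; the `eps_model` placeholder is hypothetical and stands in for a trained noise-prediction network):

```python
# Minimal sketch of DDPM-style ancestral sampling (NumPy only).
import numpy as np

def eps_model(x, t):
    # Placeholder for a trained noise-prediction network eps_theta(x_t, t).
    # A real model would be e.g. a U-Net conditioned on the timestep (and,
    # for text-to-image systems, on a text embedding).
    return np.zeros_like(x)

def ddpm_sample(shape=(32, 32, 3), T=1000, seed=0):
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)       # noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)           # start from pure "TV static"
    for t in reversed(range(T)):
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        eps = eps_model(x, t)
        # Remove a bit of the predicted noise, then re-inject a smaller
        # amount of fresh noise -- one reverse (denoising) step.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        x = mean + np.sqrt(betas[t]) * z
    return x

image = ddpm_sample()  # with a trained eps_model this would be a generated image
```

With a trained network in place of the placeholder, this loop has to be run for hundreds of steps, each requiring a forward pass of a large network, which is part of why DMs are relatively slow.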
Researchers at MIT have recently unveiled a new class of image generation models that is also inspired by physics. While DMs pull from thermodynamics, the new models, PFGMs, pull from electrodynamics. They treat the data points as charged particles and generate data by moving particles along the electric field that the data generates.
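To illustrate the idea, here is a toy sketch (not the authors' code): a handful of 2-D points act as charges on the z = 0 plane of an augmented 3-D space, and a sample is drawn by following the resulting electric field backward from far away until the particle lands on the plane. In the actual PFGM, a neural network approximates this field rather than computing it from the raw data.

```python
# Toy sketch of the Poisson-flow idea (NumPy only, 2-D toy data).
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((256, 2))                      # toy "dataset" in R^2 (the charges)
aug_data = np.concatenate([data, np.zeros((256, 1))], axis=1)  # lift onto the z = 0 plane

def poisson_field(x, z):
    """Empirical field at augmented point (x, z) generated by the data charges."""
    y = np.append(x, z)
    diff = y - aug_data                                   # vectors from each charge to the point
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    field = (diff / dist**3).mean(axis=0)                 # inverse-square law in 3-D
    return field[:2], field[2]                            # (x-component, z-component)

def pfgm_sample(z_max=40.0, z_min=1e-3, steps=500):
    # Start far above the data plane (the real model draws this starting
    # point from a known prior on a large hemisphere).
    x = rng.standard_normal(2) * z_max * 0.1
    z = z_max
    # Integrate the backward ODE dx/dz = E_x / E_z as z -> 0.
    zs = np.geomspace(z_max, z_min, steps)
    for z_next in zs[1:]:
        ex, ez = poisson_field(x, z)
        x = x + (ex / ez) * (z_next - z)
        z = z_next
    return x                                              # lands near the data distribution

sample = pfgm_sample()
```

Because sampling reduces to solving a deterministic ODE along a smooth field, far fewer steps tend to be needed than in the stochastic denoising loop above, which is roughly where the reported speedup comes from.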
Why they matter
PFGMs (Poisson Flow Generative Models) constitute an exciting area of research for many reasons; most notably, they have been shown to be 10-20x faster than DMs at comparable image quality.
Topics of discussion
- With the barrier to generating images getting lower and lower (Stable Diffusion was released only a couple of months ago!), how will the ability for anyone to create high-quality images and art affect the economy and what we perceive to be valuable art in the coming years? Consider the short, medium, and long term.
- DMs and PFGMs are both inspired by physics. Machine Learning, and Deep Learning especially, have been integrating concepts from higher-level math and physics over the past several years. How will research in these well-developed domains inform the development of Deep Learning? In particular, will leveraging hundreds of years of scientific and mathematical research be at the foundation of an intelligence explosion?
- Even someone casually interested in AI will have noticed the incredible progress made in the last couple of years. While new models like this are great and advance our understanding, what responsibility do researchers, governments, and private entities have to ensure AI research is being done safely? Or should we intentionally avoid building any guardrails to begin with?
Please reply to OP's comment here: https://old.reddit.com/r/Futurology/comments/yesiy5/new_deep_learning_method_uses_electrodynamics_to/itznmgg/
YaAbsolyutnoNikto t1_iu3j2hd wrote
Really cool. Does this mean the death of DMs or are they still going to be useful?
In other words, is this new method Pareto superior?
SleekEagle OP t1_iu49wmf wrote
I can't imagine they will replace DMs at this point. DMs have been developed significantly over the last two years, and there are a lot of notable advancements that have greatly improved their efficiency. Further, Stable Diffusion is built on DMs, and it is already cemented as a pervasive open-source tool.
It is also unclear at this point how to add conditioning to this setup and how something like super-resolution would work. I'm definitely not saying it's impossible, just that it's at a very nascent stage. It will be really cool to see how the two interact, such as incorporating DMs and PFGMs together in an Imagen-like setup!