
SleekEagle OP t1_iu569xx wrote

I don't think the paper explicitly says anything about this, but I would expect them to be similar. If anything, I would imagine they would require less memory, not more. That said, if you're thinking of e.g. DALL-E 2 or Stable Diffusion, those models also have other components that PFGMs don't (like text encoding networks), so it's entirely reasonable for them to be larger!

4

SleekEagle OP t1_iu49wmf wrote

I can't imagine they will replace DMs at this point. DMs have been developed significantly over the last two years, and there have been a lot of notable advancements that have greatly improved their efficiency. On top of that, Stable Diffusion runs on DMs, and it is already cemented as a pervasive open-source tool.

It is also unclear at this point how to add conditioning to this setup, or how something like super-resolution would work. Definitely not saying it's impossible, just that it's at a very nascent stage. It will be really cool to see how the two approaches interact, e.g. incorporating DMs and PFGMs together in an Imagen-like setup!

1

SleekEagle OP t1_itznmgg wrote

Background

The past few years have seen incredible progress in the text-to-image domain. Models like DALL-E 2, Imagen, and Stable Diffusion can generate extremely high-quality, high-resolution images given only a sentence describing what should be depicted.

These models rely heavily on Diffusion Models (DMs). DMs are a relatively new type of Deep Learning method that is inspired by physics. They learn how to start with random "TV static", and then progressively "denoise" it to generate an image. While DMs are very powerful and can create amazing images, they are relatively slow.
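
For intuition, here is a heavily simplified sketch of what that "progressive denoising" loop looks like for a DDPM-style model. This is my own illustration, not code from any of the models above, and `denoise_model` and the noise-schedule tensors are hypothetical placeholders:

```python
import torch

def sample_ddpm(denoise_model, alphas, alpha_bars, betas, shape):
    """Sketch of DDPM-style ancestral sampling: start from pure noise ("TV static")
    and repeatedly subtract the noise predicted by the learned model."""
    x = torch.randn(shape)                     # random static as the starting point
    for t in reversed(range(len(betas))):
        eps_hat = denoise_model(x, t)          # model's estimate of the noise at step t
        # DDPM mean update: remove the predicted noise component
        x = (x - betas[t] / (1 - alpha_bars[t]).sqrt() * eps_hat) / alphas[t].sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)  # re-inject a bit of noise
    return x
```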

Researchers at MIT have recently unveiled a new image generation model that is also inspired by physics. While DMs pull from thermodynamics, the new models, PFGMs, pull from electrostatics. They treat the data points as charged particles, and generate data by moving points along the electric field generated by the data.
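
At a very hand-wavy level, sampling looks like the sketch below: drop a point far away from the data and let it flow along the (learned) field back toward the data. This is just a conceptual illustration with a fixed-step Euler loop and a hypothetical `poisson_field` network; the actual method integrates an ODE with a proper solver:

```python
import numpy as np

def sample_pfgm_sketch(poisson_field, prior_sample, n_steps=500, step_size=0.01):
    """Conceptual sketch: the data acts like a set of positive charges whose field
    points away from the data, so we step *against* the field to flow a faraway
    prior sample back toward the data distribution."""
    x = prior_sample.copy()
    for _ in range(n_steps):
        field = poisson_field(x)                 # learned estimate of the field at x
        x = x - step_size * field / np.linalg.norm(field)  # move toward the data
    return x
```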

Why they matter

PFGMs (Poisson Flow Generative Models) constitute an exciting area of research for many reasons; in particular, they have been shown to be 10-20x faster than DMs with comparable image quality.

Topics of discussion

  1. With the barrier to generating images getting lower and lower (Stable Diffusion was released only a couple of months ago!), how will the ability for anyone to create high-quality images and art affect the economy and what we perceive to be valuable art in the coming years? Consider the short, medium, and long term.
  2. DMs and PFGMs are both inspired by physics. Machine Learning, and Deep Learning especially, has been integrating concepts from higher-level math and physics over the past several years. How will research in these well-developed domains inform the development of Deep Learning? In particular, will leveraging hundreds of years of scientific knowledge and mathematics research be at the foundation of an intelligence explosion?
  3. Even someone casually interested in AI will have noticed the incredible progress made in the last couple of years. While new models like this are great and advance our understanding, what responsibility do researchers, governments, and private entities have to ensure AI research is being done safely? Or, should we intentionally not be attempting to build any guard rails to begin with?
2

SleekEagle OP t1_itzkagl wrote

This is the first paper on this approach! I spoke to the authors and they're planning on continuing research down this avenue (personally I think dropping a PFGM as the base generator for Imagen and then keeping Diffusion Models for the super resolution chain would be very cool), so be on the lookout for more papers!

3

SleekEagle OP t1_itxpvth wrote

My pleasure! I'm not sure I understand exactly what you're asking, could you try to rephrase it? In particular, I'm not sure what you mean by preservation of the data distribution.

Maybe this will help answer: given an exact Poisson field generated by a continuous data distribution, PFGMs provide an exact deterministic mapping to/from a uniform hemisphere. While we do not know this Poisson field exactly, we can estimate it given many data points sampled from the distribution. PFGMs therefore provide a deterministic mapping between the uniform hemisphere and the distribution that corresponds to the learned empirical field, but not exactly to the data distribution itself. Given a lot of data, though, we expect this approximate distribution to be very close to the true distribution (this is universal to all generative models).
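
In case it helps make the "uniform hemisphere" prior concrete, here is a tiny sketch (my own illustration, not code from the paper) of drawing samples uniformly from a large hemisphere in the augmented space:

```python
import numpy as np

def sample_uniform_hemisphere(n_samples, data_dim, radius):
    """Sample points uniformly on the upper half of a sphere of the given radius:
    normalized Gaussians are uniform on the sphere, and reflecting the last
    (augmented "z") coordinate to be non-negative restricts to a hemisphere."""
    v = np.random.randn(n_samples, data_dim + 1)     # data dims + 1 augmented dim
    v /= np.linalg.norm(v, axis=1, keepdims=True)    # uniform on the unit sphere
    v[:, -1] = np.abs(v[:, -1])                      # keep z >= 0 -> upper hemisphere
    return radius * v
```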

Thanks for reading! I did have a little trouble getting the repo working on my local machine, so you may run into some hiccups as well. I reached out to the authors while writing this article, and I believe they are planning on continuing research into PFGMs, so keep an eye out for future developments!

10

SleekEagle OP t1_itwepjh wrote

u/Serverside There are a few things to note here.

First, the non-stochasticity allows for likelihood evaluation. Second, it allows the authors to use ODE solvers (RK45 in the case of PFGMs) instead of the SDE solvers, potentially combined with application-specific methods like Langevin MCMC, that score-based models use. Further, for diffusion there are many discrete time steps that need to be evaluated in series (at least for the usual discrete-time diffusion models). The result is that PFGMs are faster than these stochastic methods. Lastly, the particular ODE for PFGMs has a weaker norm-time correlation than other ODEs, which in turn makes sampling more robust.
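
As a concrete illustration of the ODE-solver point: once the dynamics are deterministic, you can hand sampling to an off-the-shelf adaptive solver. A rough sketch using SciPy's RK45, where `field_fn` is a hypothetical stand-in for the learned network defining dx/dt:

```python
import numpy as np
from scipy.integrate import solve_ivp

def ode_sample(field_fn, x0, t_start=1.0, t_end=0.0):
    """Integrate dx/dt = field_fn(t, x) from a prior sample x0 at t_start
    back to t_end with an adaptive RK45 solver; no stochastic steps needed."""
    sol = solve_ivp(
        fun=lambda t, x: field_fn(t, x.reshape(x0.shape)).ravel(),
        t_span=(t_start, t_end),     # integrating "backwards" in time is fine
        y0=x0.ravel(),
        method="RK45",
        rtol=1e-4,
        atol=1e-4,
    )
    return sol.y[:, -1].reshape(x0.shape)   # final state = generated sample
```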

As for the deterministic mapping, it is actually the reason (at least in part) that e.g. interpolations work for PFGMs. The mapping is a continuous transformation, mapping "near points to near points" by definition. I think the determinism ensures that interpolated paths in the latent space transform to well-behaved paths in the data space, whereas a stochastic element would very likely break this correspondence. The stochasticity in VAEs is useful for learning the parameters of a distribution and is required to sample from that distribution, but once a point is sampled it is (usually) deterministically mapped back to the data space, iirc.
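
To make the interpolation point concrete, here is a small sketch (assuming some deterministic `latent_to_image` mapping, e.g. the ODE sampler idea above):

```python
import numpy as np

def interpolate(latent_to_image, z_a, z_b, n=8):
    """Decode points along a straight line between two latents. Because the
    latent -> data mapping is deterministic and continuous, nearby latents
    decode to nearby images, so the sequence varies smoothly."""
    return [
        latent_to_image((1 - alpha) * z_a + alpha * z_b)
        for alpha in np.linspace(0.0, 1.0, n)
    ]
```

(For latents living on a hemisphere, spherical interpolation would be the more natural choice, but the idea is the same.)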

20

SleekEagle t1_is6trdz wrote

I'm really excited to see what the next few years bring. I've always felt like there's a lot of room for growth from higher level math, and it seems like that's beginning to happen.

I'll have a blog coming out soon on another physics-inspired model like DDPM, stay tuned!

4

SleekEagle t1_irx747e wrote

Is progress really in question? It seems very obvious that we have made progress in the last 5 years; judging the field by GANs alone seems ridiculous when Diffusion Models are sitting right there. Not trying to be a jerk, genuinely curious if anyone actually thinks that progress as a whole is not being made?

I definitely sympathize with frustration over the "incremental progress" that comes down to 0.1% better performance on some imperfect metric between big developments (GANs, transformers, diffusion models), but setting those papers aside and looking at the bigger trends, it seems obvious that really incredible progress has been made.

8

SleekEagle t1_ir0hr1m wrote

The outputs of neurons are passed through a nonlinearity, which is essential to any (complex) learning process. If we didn't do this, the NN would be a composition of linear functions, which is itself a linear function (pretty boring).

As for why we choose to operate on inputs with an affine transformation before putting them through a nonlinearity, I see two reasons. The first is that linear transformations are well understood and succinct to work with theoretically. The second is that computers (GPUs in particular) are very good at matrix multiplication, so we do a lot of the "heavy lifting" with it and then just pass the result through a nonlinearity so we don't end up with a boring (linear) learning process.
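
A quick numerical illustration of the "composition of linear functions is linear" point (my own toy example, not tied to any particular framework):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((2, 4)), rng.standard_normal(2)
x = rng.standard_normal(3)

# Two stacked affine layers with no nonlinearity in between...
two_layers = W2 @ (W1 @ x + b1) + b2
# ...collapse into a single affine layer: same output, no extra expressivity.
one_layer = (W2 @ W1) @ x + (W2 @ b1 + b2)
print(np.allclose(two_layers, one_layer))   # True

# A ReLU between the layers breaks the collapse, which is what lets
# the network represent functions a single affine map cannot.
with_relu = W2 @ np.maximum(W1 @ x + b1, 0) + b2
print(np.allclose(with_relu, one_layer))    # False (in general)
```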

Just my 2 cents, happy for input/feedback!

1