Submitted by TheCockatoo t3_10m1sdm in MachineLearning
This isn't the whole answer, but GANs are super hard to train, while diffusion models are an instance of some much more well understood methods (MLE, score matching, variational inference). That leads to a few things:
- It's more reliable to converge (which leads to enthusiasm)
- It's easier to debug (which leads to progress)
- It's better understood (which leads to progress)
- It's simpler (which leads to progress)
- It's more modular (which leads to progress)
Hypothetically, it could even be that the best simple GAN is better than the best simple diffusion model, but because it's easier to iterate on diffusion models, we're still more likely to find the good ways to do diffusion.
tl;dr when I worked on GANs, I felt like a monkey hitting a computer with a wrench to make it work, while when I work on diffusion models, I feel like a mathematician deriving Right Answers™.
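To make the "it's simpler and better understood" point concrete: the core of diffusion training is just noise-prediction regression with an MSE loss, an instance of denoising score matching. Here is a minimal sketch in PyTorch; the tiny `make_denoiser` MLP, the linear beta schedule, and the crude timestep conditioning are all placeholder assumptions standing in for a real U-Net and schedule, not anyone's actual implementation.

```python
import torch

# Hypothetical tiny denoiser standing in for a real U-Net.
def make_denoiser(dim=8):
    return torch.nn.Sequential(
        torch.nn.Linear(dim + 1, 32),
        torch.nn.ReLU(),
        torch.nn.Linear(32, dim),
    )

def diffusion_loss(model, x0, T=1000):
    """One DDPM-style training step: predict the noise added to x0.

    This is plain MSE regression (denoising score matching), which is
    why it converges reliably compared to a GAN's adversarial min-max
    game between two networks.
    """
    betas = torch.linspace(1e-4, 0.02, T)             # assumed linear noise schedule
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)

    t = torch.randint(0, T, (x0.shape[0],))           # random timestep per sample
    a = alpha_bar[t].unsqueeze(1)                     # cumulative signal level at t
    eps = torch.randn_like(x0)                        # noise the model must recover
    xt = a.sqrt() * x0 + (1 - a).sqrt() * eps         # forward (noising) process, closed form

    t_feat = (t.float() / T).unsqueeze(1)             # crude timestep conditioning
    eps_hat = model(torch.cat([xt, t_feat], dim=1))
    return torch.nn.functional.mse_loss(eps_hat, eps)
```

There is no discriminator, no balancing of two optimizers, and no mode collapse failure mode: the loss is an ordinary regression objective you can debug like any other.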
Quaxi_ t1_j6421fo wrote
And besides being easier to train, they give better results.
Diffusion models are also so much more versatile in their application because of their iterative process.
You can do inpainting or img-to-img, for example, just by conditioning the noise in different ways. You would have to retrain the whole GAN to achieve that.
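As a rough sketch of the img-to-img trick mentioned above: instead of starting the reverse process from pure noise, you noise the input image to an intermediate timestep and denoise from there, reusing the pretrained model unchanged. The `strength` parameter and function name here are illustrative assumptions, not a specific library's API.

```python
import torch

def img2img_start(x_init, strength, alpha_bar):
    """Pick the starting point for img-to-img sampling.

    Noise the input image to an intermediate timestep chosen by
    `strength` in [0, 1] (higher = more noise = looser adherence to
    the input), then resume the usual reverse diffusion loop from
    there. No retraining needed -- the same noise-prediction model
    handles it, whereas a GAN would need retraining for the new task.
    """
    T = alpha_bar.shape[0]
    t_start = min(int(strength * T), T - 1)
    a = alpha_bar[t_start]
    eps = torch.randn_like(x_init)
    xt = a.sqrt() * x_init + (1 - a).sqrt() * eps  # forward process to t_start
    return xt, t_start  # hand (xt, t_start) to the ordinary sampling loop
```

Inpainting works similarly: at each reverse step you overwrite the known pixels with a correspondingly-noised copy of the original, so the model only has to fill in the masked region.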
bloc97 t1_j63q1nk wrote
>It's simpler (which leads to progress)
I wouldn't say current diffusion models are simpler; in fact, they are much more complex than even the most "complex" GAN architectures. But it's exactly because of all the other points that they have become this complex. A vanilla GAN would never endure this much tweaking without mode collapse. Compare that to even the most basic score-based models, which are always stable.
Sometimes, the "It just works™" proposition is much more appealing than pipeline simplicity or speed.