Mefaso

Mefaso t1_j9dbox6 wrote

>IMO reviewers at these journals/conferences need to be more mindful of this kind of plagiarism/low-effort submission.

Workshops in general have a very low bar; this surely wouldn't have been published in the main track.

Other than that I don't really see the point of this rant.

Yes, there are a lot of bad papers, even in the main tracks. You just kind of get used to it.

It feels a lot like punching down. Maybe these are undergraduates doing their first research project, and it's more about learning the methodology and the writing than about very novel approaches.

16

Mefaso t1_j6z6zgt wrote

>DALL-E 2 also applies diffusion in latent space

Not really in the important part. DALL-E 2 uses diffusion in CLIP "latent" space and then conditions the pixel-diffusion model on the result.

However, they still do a full diffusion pass in pixel space, which is more complex than doing it in latent space, as LDMs do.
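
To put rough numbers on the pixel-space vs. latent-space difference, here is a back-of-the-envelope sketch. The sizes are assumptions for illustration only (a 512x512 RGB image, and an autoencoder that downsamples by 8x into 4 latent channels, as in commonly used LDM setups); they are not taken from the DALL-E 2 paper. It only counts tensor elements per denoising step, which is a crude proxy and not a FLOP count.

```python
def num_elements(height, width, channels):
    """Number of values the denoiser has to process in one step."""
    return height * width * channels

# Assumed sizes, for illustration only.
IMAGE_SIZE = 512      # image height and width in pixels
DOWNSAMPLE = 8        # assumed autoencoder downsampling factor
LATENT_CHANNELS = 4   # assumed number of latent channels

pixel = num_elements(IMAGE_SIZE, IMAGE_SIZE, 3)
latent = num_elements(IMAGE_SIZE // DOWNSAMPLE, IMAGE_SIZE // DOWNSAMPLE, LATENT_CHANNELS)
ratio = pixel / latent

print(pixel, latent, ratio)  # 786432 16384 48.0
```

Under these assumptions, each denoising step in pixel space touches about 48x more values than the same step in latent space.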

1

Mefaso t1_j29980m wrote

>i found that text to video problem is being actively researched and may not require as much compute as bare language models

There are always opportunities for research with little compute; usually this means your research has to avoid training new models, or at least avoid training from scratch.

However, text-to-video models are typically very compute-intensive.

7