Batteredcode

Batteredcode t1_jci3t9m wrote

Great, thank you so much for the detailed answer. Do you have anything you could point me to (or could you explain further) on how I could modify a diffusion method to do this?
Also, regarding the VAE, I was thinking I could feed 2 channels in and train it to output 3 channels. I believe the encoder wouldn't be useless in that case, and so my latent would be more than merely the missing channel? Feel free to correct me if I'm wrong! My assumption is that even so, a plain NN may well perform better, or at least serve as a simpler baseline. That said, my images will be similar in certain ways, so presumably being able to model a distribution over the latents could prove useful?
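To make the VAE idea concrete, here is a rough PyTorch sketch of what I have in mind: the encoder sees only the R and G channels and the decoder is trained to reconstruct all three. The class name `TwoToThreeVAE`, the layer sizes, the 32x32 image size, and the random stand-in batch are all my own placeholder assumptions, and I haven't tuned any of it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoToThreeVAE(nn.Module):
    """Hypothetical VAE: encodes a 2-channel (R, G) image, decodes 3 channels (R, G, B)."""

    def __init__(self, latent_dim: int = 16, size: int = 32):
        super().__init__()
        self.half = size // 2
        feat = 16 * self.half * self.half
        # Encoder: one strided conv halves the spatial size, then flatten.
        self.enc = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Flatten(),
        )
        self.to_mu = nn.Linear(feat, latent_dim)
        self.to_logvar = nn.Linear(feat, latent_dim)
        # Decoder: project the latent back to a feature map, then upsample to 3 channels.
        self.from_z = nn.Linear(latent_dim, feat)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(16, 3, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # assumes pixel values are scaled to [0, 1]
        )

    def forward(self, rg):
        h = self.enc(rg)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        h = self.from_z(z).view(-1, 16, self.half, self.half)
        return self.dec(h), mu, logvar


torch.manual_seed(0)
rgb = torch.rand(4, 3, 32, 32)  # stand-in batch, not real data
rg = rgb[:, :2]                 # input keeps only the R and G channels

model = TwoToThreeVAE()
recon, mu, logvar = model(rg)

# Standard VAE objective: reconstruction error on all 3 channels plus KL term.
recon_loss = F.mse_loss(recon, rgb, reduction="sum")
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
loss = (recon_loss + kl) / rgb.size(0)
```

The only difference from an ordinary VAE here is that the input and the reconstruction target differ, so the latent has to summarize whatever in R and G helps predict the full image.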

1

Batteredcode t1_jccqitv wrote

I'm looking to train a model that takes an image and reconstructs it with additional information, for example, taking the R and G channels of an image and recreating it with the B channel added. At first glance it seems like an in-painting model would be best suited to this, treating the missing information as the mask, but I don't know if this assumption is correct as I don't have much experience with those kinds of models. Additionally, I'd like to progress from a really simple baseline to something more complex, so I was wondering whether a simple CNN or an autoencoder, trained to output the target image given the image with the information missing, would be a reasonable starting point, though I may be way off here. Any help greatly appreciated!
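For reference, this is roughly the kind of simple CNN baseline I mean, sketched in PyTorch. It takes the R and G channels as input and is trained with a plain MSE loss to output the full RGB image. The name `ChannelFillCNN`, the layer widths, and the random stand-in batch are just my placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ChannelFillCNN(nn.Module):
    """Hypothetical baseline: maps a 2-channel (R, G) image to a 3-channel (R, G, B) image."""

    def __init__(self, hidden: int = 32):
        super().__init__()
        # padding=1 with 3x3 kernels keeps the spatial size unchanged.
        self.net = nn.Sequential(
            nn.Conv2d(2, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, 3, kernel_size=3, padding=1),
            nn.Sigmoid(),  # assumes pixel values are scaled to [0, 1]
        )

    def forward(self, rg):
        return self.net(rg)


torch.manual_seed(0)
rgb = torch.rand(8, 3, 32, 32)  # stand-in batch; real images would go here
rg = rgb[:, :2]                 # drop the B channel to form the input

model = ChannelFillCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step: reconstruct all three channels from R and G.
pred = model(rg)
loss = F.mse_loss(pred, rgb)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Predicting only the B channel and concatenating it with the known R and G would be an equally simple variant.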

1