Hello,

I am a PhD student working on a project that revolves around correcting geometric distortions in images. More specifically, the goal is to correct cylindrical distortions in QR codes in order to improve the decoding success rate.

So far I have implemented traditional methods and got interesting results, but I'm now interested in using machine learning to tackle this problem. Since I'm still relatively new to machine learning, I would like to hear your feedback/opinions on the subject, as well as suggestions on reading material to start with.

From my limited research so far, I believe a generative adversarial network would probably be the right choice for this problem, but again I'm not sure and I'm really open to all suggestions/ideas.

Comments

MediumOrder5478 t1_j2p3uau wrote on January 2, 2023 at 11:39 PM

I would have the network regress the lens distortion parameters (like k1 to k6, p1, p2). You should be able to produce synthetic rendered training data.
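For what it's worth, synthetic training pairs of this kind can be produced with very little code. The sketch below is only an illustration under strong assumptions that are not from the thread: a random grid of modules stands in for a real QR code, the cylinder axis is vertical and centred, and the camera is orthographic. The names `fake_code`, `wrap_on_cylinder` and `make_pair` are hypothetical helpers.

```python
import numpy as np

def fake_code(rng, modules=21, scale=8):
    """Random black/white module grid (a stand-in, NOT a valid QR code)."""
    grid = rng.integers(0, 2, size=(modules, modules))
    block = np.ones((scale, scale), dtype=np.uint8)
    return (np.kron(grid, block) * 255).astype(np.uint8)

def wrap_on_cylinder(flat, radius):
    """Orthographic view of `flat` wrapped around a vertical cylinder.

    `radius` is in pixels of the flat image. The code must cover at most
    half of the circumference, i.e. width / (2 * radius) <= pi / 2.
    """
    h, w = flat.shape
    half_angle = (w / 2) / radius
    if half_angle > np.pi / 2:
        raise ValueError("radius too small: code wraps past the visible half")
    half_width = radius * np.sin(half_angle)      # visible half-width
    out_w = int(np.floor(2 * half_width))
    # Horizontal position of each output pixel centre, relative to the axis.
    x = (np.arange(out_w) + 0.5) - out_w / 2
    # Arc length on the cylinder surface that projects to x.
    s = radius * np.arcsin(x / radius)
    src = np.clip(np.floor(s + w / 2).astype(int), 0, w - 1)
    return flat[:, src]                           # nearest-neighbour resample

def make_pair(rng):
    """One (distorted image, radius) training example."""
    flat = fake_code(rng)
    min_r = flat.shape[1] / np.pi                 # smallest admissible radius
    radius = rng.uniform(1.05 * min_r, 4.0 * min_r)
    return wrap_on_cylinder(flat, radius), radius
```

A regressor would then be trained to predict `radius` (or any other parameter of the chosen distortion model) from the distorted image. A realistic pipeline would also add perspective, rotation, blur and lighting changes.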

LordChips4 OP t1_j2pd2ry wrote on January 3, 2023 at 12:43 AM

The distortion I'm aiming to correct is not from the lens but from a QR code posted on a cylindrical surface (e.g. a QR code posted on a lamppost) with an unknown radius. So (at least by my understanding) there are no parameters? My input would be a distorted QR code image, and the output from the trained network would be the predicted QR code image without distortion. Am I wrong in my approach/way of thinking?

Pyrite_Pro t1_j2p4q30 wrote on January 2, 2023 at 11:45 PM

That may be a better approach than just using GANs, given that OP uses these inferred parameters for “lossless” correction. GANs themselves may not reconstruct all image details faithfully.

PredictorX1 t1_j2pdcx4 wrote on January 3, 2023 at 12:44 AM

Have you tried conventional image registration techniques? One common process is to manually or automatically determine matching pairs of points in the image being adjusted and a reference image, and fit linear or low-order polynomials to map the coordinates of one to the other. I'd imagine that radial basis function neural networks would be quite good at making such a mapping.
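As a rough illustration of the point-pair approach described above, here is a minimal sketch that fits a second-order polynomial mapping by least squares. It assumes the matched points are already available as `(N, 2)` float arrays. The function names are hypothetical, and the choice of a quadratic basis is an assumption for the example.

```python
import numpy as np

def poly_terms(pts):
    """Second-order polynomial basis [1, x, y, x^2, xy, y^2] for (N, 2) points."""
    x, y = pts[:, 0], pts[:, 1]
    return np.stack([np.ones_like(x), x, y, x * x, x * y, y * y], axis=1)

def fit_mapping(src_pts, dst_pts):
    """Least-squares fit of coefficients mapping src_pts to dst_pts.

    Needs at least 6 non-degenerate point pairs. Returns a (6, 2) array:
    one column of coefficients for x, one for y.
    """
    coeffs, *_ = np.linalg.lstsq(poly_terms(src_pts), dst_pts, rcond=None)
    return coeffs

def apply_mapping(coeffs, pts):
    """Map (N, 2) points through the fitted polynomial."""
    return poly_terms(pts) @ coeffs
```

Usage would look like `coeffs = fit_mapping(distorted_pts, reference_pts)` followed by `apply_mapping(coeffs, new_pts)`. For QR codes, the finder and alignment patterns are natural candidates for the matched points.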

bloc97 t1_j2pfp6l wrote on January 3, 2023 at 1:01 AM

GANs are generative models; you want a discriminative model (for regression?). You could start by predicting keypoints, similar to the task of pose estimation. In your case, you could predict 3D coordinates for the four corners of the QR code, plus two points to determine the axis of the cylinder. Then you can easily remove the distortion by inverting the cylindrical projection.
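To make the last step concrete, here is a heavily simplified sketch of inverting a cylindrical projection once the geometry is known. It is not the full 3D version described above. It assumes an orthographic camera, a vertical cylinder axis through the image centre, and a known radius in pixels. `unwrap_cylinder` is a hypothetical name.

```python
import numpy as np

def unwrap_cylinder(img, radius):
    """Flatten an image of a surface wrapped on a vertical cylinder.

    Assumes orthographic projection with the axis through the image centre.
    `radius` is in pixels and must be at least half the image width.
    """
    h, w = img.shape[:2]
    half_width = w / 2
    if half_width > radius:
        raise ValueError("radius must be >= half the image width")
    # Arc length on the surface spanned by half of the visible image.
    half_arc = radius * np.arcsin(half_width / radius)
    out_w = int(np.ceil(2 * half_arc))
    # Arc-length position of each output pixel centre, relative to the axis.
    s = (np.arange(out_w) + 0.5) - out_w / 2
    # Where that surface point lands in the distorted image.
    x = radius * np.sin(s / radius)
    src = np.clip(np.floor(x + half_width).astype(int), 0, w - 1)
    return img[:, src]                  # nearest-neighbour resample
```

The unwrapped image is wider than the input because columns near the edges, which the cylinder compresses, are stretched back out. With predicted keypoints, the radius and axis would come from the network rather than being passed in by hand.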

Novel-Ant-7160 t1_j2rr3bn wrote on January 3, 2023 at 2:25 PM

Maybe something like: https://arxiv.org/abs/1506.02025

Realistic_Decision99 t1_j2p6qn0 wrote on January 2, 2023 at 11:59 PM

The lens distortion models that are used extensively are all linear. Maybe instead of using a GAN you could use a simpler type of network, like a fully connected dense one, to effectively fit an unknown non-linear model. This could reflect additive noise from other types of distortion (e.g. due to sensor topography), or complex lens distortions (combination of multiple distortion effects).

LordChips4 OP t1_j2pd8zf wrote on January 3, 2023 at 12:44 AM