Submitted by LordChips4 t3_101pvlg in MachineLearning

Hello,

I am a PhD student working on a project on correcting geometric distortions in images. More specifically, the goal is to correct cylindrical distortions in QR codes in order to improve the decoding success rate.

So far I have implemented traditional methods and got interesting results, but I'm now interested in using machine learning to tackle this problem. Since I'm still relatively new to machine learning, I would like to hear your feedback/opinions on the subject, as well as suggestions for reading material to start with.

From my limited research so far, I believe a generative adversarial network would probably be the right choice for this problem, but again I'm not sure, and I'm really open to all suggestions/ideas.

2

Comments

MediumOrder5478 t1_j2p3uau wrote

I would have the network regress the lens distortion parameters (like k1 to k6, p1, p2). You should be able to produce synthetic rendered training data.
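As a rough illustration of what synthetic training pairs for such a regressor could look like, here is a minimal sketch. It assumes a simplified radial model with only k1 and k2 (the full model mentioned above also has k3 to k6 and the tangential terms p1, p2), and it uses a grid of distorted points as a stand-in for a rendered image. The function names and the sampling ranges are hypothetical.

```python
import random


def distort(x, y, k1, k2):
    """Apply a simplified radial distortion to a normalized point (x, y)."""
    r2 = x * x + y * y
    scale = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * scale, y * scale


def make_sample(rng, grid_n=5):
    """Return (distorted grid points, (k1, k2)) as one training pair.

    The sampling ranges below are arbitrary assumptions for illustration.
    """
    k1 = rng.uniform(-0.3, 0.3)
    k2 = rng.uniform(-0.05, 0.05)
    # Regular grid over [-1, 1] x [-1, 1] in normalized image coordinates.
    coords = [-1.0 + 2.0 * i / (grid_n - 1) for i in range(grid_n)]
    points = [distort(x, y, k1, k2) for y in coords for x in coords]
    return points, (k1, k2)


if __name__ == "__main__":
    points, label = make_sample(random.Random(0))
    print(len(points), label)
```

In a real pipeline the network input would be a rendered, distorted image rather than a point grid, and the label would be whichever parameter vector the chosen distortion model uses.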

5

LordChips4 OP t1_j2pd2ry wrote

The distortion I'm aiming to correct is not from the lens but from a QR code posted on a cylindrical surface (e.g. a QR code posted on a lamppost) with an unknown radius. So (at least by my understanding) there are no parameters? My input would be a distorted QR code image, and the output from the trained network would be the predicted QR code image without distortion. Am I wrong in my approach/way of thinking?

3

Pyrite_Pro t1_j2p4q30 wrote

That may be a better approach than just using GANs, provided OP uses the inferred parameters for a “lossless” correction. GANs themselves may not reconstruct all image details faithfully.

2

PredictorX1 t1_j2pdcx4 wrote

Have you tried conventional image registration techniques? One common process is to manually or automatically determine matching pairs of points in the image being adjusted and a reference image, and fit linear or low-order polynomials to map the coordinates of one to the other. I'd imagine that radial basis function neural networks would be quite good at making such a mapping.
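The point-pair fitting step described above can be sketched as a least-squares fit of a second-order polynomial mapping. This is only a minimal sketch: it assumes the matching point pairs are already available, it uses NumPy's `np.linalg.lstsq`, and the helper names (`design_matrix`, `fit_poly_map`, `apply_poly_map`) are made up for illustration.

```python
import numpy as np


def design_matrix(pts):
    """Second-order polynomial terms [1, x, y, x^2, xy, y^2] per point."""
    pts = np.asarray(pts, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])


def fit_poly_map(src, dst):
    """Least-squares fit of a mapping from src points to dst points.

    Returns a (6, 2) coefficient array: one column for x', one for y'.
    """
    A = design_matrix(src)
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(dst, dtype=float), rcond=None)
    return coeffs


def apply_poly_map(coeffs, pts):
    """Map points through the fitted polynomial."""
    return design_matrix(pts) @ coeffs
```

At least six well-spread point pairs are needed for the second-order fit to be determined; with a QR code, the finder and alignment patterns are natural candidates for such points.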

2

bloc97 t1_j2pfp6l wrote

GANs are generative models; you want a discriminative model (for regression?). You could start by predicting keypoints, similar to the task of pose estimation, but in your case you could predict 3D coordinates for the four corners of the QR code, plus two points to determine the axis of the cylinder. Then you can easily remove the distortion by inverting the cylindrical projection.
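To make the inversion step concrete, here is a minimal sketch under strong simplifying assumptions that are not part of the suggestion above: an orthographic camera, a cylinder axis that is vertical and parallel to the image plane, and an already known radius (in pixels) and axis column `cx`. Under those assumptions a surface point at arc length s from the axis line appears at column x = cx + R sin(s / R), so the flat coordinate is s = R asin((x - cx) / R). The function names are hypothetical.

```python
import math


def wrap_x(s, cx, radius):
    """Arc length on the cylinder surface -> image column (forward projection)."""
    return cx + radius * math.sin(s / radius)


def unwrap_x(x, cx, radius):
    """Image column -> arc length on the cylinder surface (inverse projection)."""
    t = (x - cx) / radius
    if abs(t) > 1.0:
        raise ValueError("column lies outside the visible part of the cylinder")
    return radius * math.asin(t)


def unwrap_row(row, cx, radius, s_values):
    """Resample one image row at the given arc lengths (nearest neighbour)."""
    out = []
    for s in s_values:
        x = wrap_x(s, cx, radius)
        xi = min(max(int(round(x)), 0), len(row) - 1)
        out.append(row[xi])
    return out
```

Applying `unwrap_row` to every row with uniformly spaced `s_values` gives a flattened image. With a perspective camera or a tilted axis, the predicted 3D points would be needed to set up the full projection instead of this one-dimensional shortcut.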

1

Realistic_Decision99 t1_j2p6qn0 wrote

The lens distortion models that are used extensively are all linear. Maybe instead of using a GAN you could use a simpler type of network, like a fully connected dense one, to effectively fit an unknown non-linear model. This could reflect additive noise from other types of distortion (e.g. due to sensor topography), or complex lens distortions (combination of multiple distortion effects).

0

LordChips4 OP t1_j2pd8zf wrote

The distortion I'm aiming to correct is not from the lens but from a QR code posted on a cylindrical surface (e.g. a QR code posted on a lamppost) with an unknown radius. So (at least by my understanding) there are no parameters? My input would be a distorted QR code image, and the output from the trained network would be the predicted QR code image without distortion. Am I wrong in my approach/way of thinking? I copy-pasted the response from above since I feel it fits here as well!

1