pm_me_your_pay_slips t1_ixz544s wrote
Reply to comment by carbocation in [R] QUALCOMM demos 3D reconstruction on AR glasses — monocular depth estimation with self supervised neural network processed on glasses and smartphone in realtime by SpatialComputing
One camera is cheaper than two, though. Cheaper in every sense (compute, memory, network bandwidth, energy consumption, parts cost, etc.).
mg31415 t1_iy2pg4i wrote
How is one camera cheaper computationally? If it were stereo, they wouldn't need a NN.
pm_me_your_pay_slips t1_iy2twej wrote
With stereo you still need to compute features and find correspondences between the two views. If you use a learned feature extractor, it has to run once per camera, so that stage alone costs roughly twice as much as the monocular model. Even with a classical feature extractor, you still have to do the matching, and for dense depth maps those two stages can be as expensive as, if not more expensive than, a single forward pass through a highly optimized mobile NN architecture.
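A rough sketch of the comparison, assuming OpenCV's semi-global block matching as the classical dense-stereo stand-in and the publicly available MiDaS-small model as a stand-in for the mobile monocular network (the actual on-device network in the demo is not public, and the synthetic inputs make the timings purely illustrative):

```python
# Rough cost comparison: classical dense stereo matching vs. one monocular
# depth forward pass. MiDaS-small is only a proxy for a mobile-optimized
# network; requires opencv-python, torch, and timm to be installed.
import time

import cv2
import numpy as np
import torch

H, W = 480, 640
# Synthetic "rectified" image pair, just to exercise both pipelines.
left = np.random.randint(0, 256, (H, W, 3), dtype=np.uint8)
right = np.random.randint(0, 256, (H, W, 3), dtype=np.uint8)

# --- Stereo path: matching costs + aggregation over the whole image pair.
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
t0 = time.perf_counter()
disparity = sgbm.compute(
    cv2.cvtColor(left, cv2.COLOR_BGR2GRAY),
    cv2.cvtColor(right, cv2.COLOR_BGR2GRAY),
)
t_stereo = time.perf_counter() - t0

# --- Monocular path: a single forward pass through a depth network.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform
with torch.no_grad():
    inp = transform(cv2.cvtColor(left, cv2.COLOR_BGR2RGB))
    t0 = time.perf_counter()
    depth = midas(inp)
    t_mono = time.perf_counter() - t0

print(f"SGBM dense disparity:   {t_stereo * 1e3:.1f} ms, map {disparity.shape}")
print(f"Monocular forward pass: {t_mono * 1e3:.1f} ms, map {tuple(depth.shape)}")
```

The point isn't which number wins on a laptop; it's that dense matching scales with image size and disparity range on two streams, while the monocular path is one fixed-cost inference on one stream, which is what a mobile NPU is optimized for.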