jnez71 t1_ivp6ril wrote
Reply to comment by CPOOCPOS in [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
Oh, I should add that from a nonconvex optimization perspective, the volume-averaging could provide heuristic benefits akin to GD+momentum-type optimizers. (I edited my first comment to reflect this.)
Try playing around with your idea in low dimensions on a classical computer to get a feel for it first. Might help you think of new ways to research it.
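For instance, here is a minimal 1-D sketch of that kind of experiment. Everything in it is my own assumption for illustration, not something from your setup: the toy loss `f`, the helper names `averaged_grad` and `descend`, the averaging radius of 0.6, and the step size. The "volume" is just an interval around the current point, and the average is taken over a fixed grid of midpoints rather than by whatever sampling your method would actually use.

```python
import numpy as np

# Hypothetical toy loss: a quadratic bowl with fast wiggles (many local minima).
def f(x):
    return 0.5 * x**2 + 2.0 * np.sin(5.0 * x)

# Exact gradient of f at a single point.
def grad(x):
    return x + 10.0 * np.cos(5.0 * x)

# Gradient averaged over the interval [x - radius, x + radius],
# approximated on the midpoints of n equal sub-intervals.
def averaged_grad(x, radius, n=101):
    offsets = (np.arange(n) + 0.5) / n * 2.0 * radius - radius
    return float(np.mean(grad(x + offsets)))

# Plain gradient descent with a fixed step size.
def descend(grad_fn, x0, lr=0.01, steps=3000):
    x = x0
    for _ in range(steps):
        x = x - lr * grad_fn(x)
    return x

x0 = 4.0
x_point = descend(grad, x0)
x_avg = descend(lambda x: averaged_grad(x, radius=0.6), x0)

print("point gradient:    x =", x_point, " f(x) =", f(x_point))
print("averaged gradient: x =", x_avg, " f(x) =", f(x_avg))
```

In this particular toy, the point-gradient run should stall in a local minimum close to where it started, while the averaged run slides past the wiggles and ends up near the bottom of the bowl. Note that it settles at a minimum of the smoothed surrogate, not exactly at the true global minimum, so you'd probably want to shrink the radius or switch back to the point gradient at the end. Try varying `radius` and `x0` to see when the effect disappears.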