Submitted by CPOOCPOS t3_yql3wl in MachineLearning
Hello, I'm wondering whether, during learning, there is an advantage to taking the average gradient over a small volume in our parameter space, compared to just taking the gradient at the single center point of that volume.
Edit:
I noticed that I have confused some people, and rightly so, because I didn't mention some aspects. The computational overhead of averaging over that many points is not important here; the reason why is that this can be done extremely cheaply on a quantum computer. My question, more precisely, is: given that the computational overhead can be ignored, are there in general theoretical advantages to taking the average over a volume compared to taking the gradient at a single point?
I am a physicist with only some experience in ML. My thesis is currently in Quantum Machine Learning, and this question has been central to my research for some weeks. Unfortunately, I can't find anything online that addresses it. I was wondering if someone here might have some insights.
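One classical way to think about this: by linearity of differentiation, the average of the gradient over a volume equals the gradient of the volume-averaged (smoothed) loss, so the question is equivalent to asking whether optimizing a locally smoothed loss helps. Below is a minimal classical Monte Carlo sketch of the comparison, using a hypothetical toy loss and uniform perturbations in a cube of side `2 * radius` (all names and choices here are illustrative assumptions, not part of the original question):

```python
import numpy as np

def loss(w):
    # Hypothetical toy non-convex loss: smooth bowl plus oscillation
    return np.sum(w**2) + 0.5 * np.sum(np.sin(10 * w))

def grad(w):
    # Analytic gradient of the toy loss above
    return 2 * w + 5 * np.cos(10 * w)

def averaged_grad(w, radius=0.1, n_samples=1000, rng=None):
    # Monte Carlo estimate of the gradient averaged over a small
    # cube of half-width `radius` around w. By linearity this is an
    # estimate of the gradient of the smoothed loss at w.
    rng = np.random.default_rng(0) if rng is None else rng
    perturbations = rng.uniform(-radius, radius, size=(n_samples, w.size))
    return np.mean([grad(w + p) for p in perturbations], axis=0)

w = np.array([0.3, -0.7])
g_point = grad(w)          # gradient at the single center point
g_avg = averaged_grad(w)   # gradient averaged over the volume
```

The averaged estimate damps the high-frequency oscillatory term of the toy loss, which is the usual classical intuition for why volume averaging can help: it acts as a low-pass filter on the loss landscape. Whether that filtering is beneficial depends on the landscape; on a purely convex quadratic the two gradients coincide in direction, so any advantage comes from sharp or noisy structure.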