onedertainer t1_ivp6xgj wrote on November 9, 2022 at 4:31 PM
Reply to [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
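This sounds like mini-batch gradient descent, where each update uses the average (or sum) of the gradients over a batch of points instead of the gradient at a single point. It could also be something like Adam, which keeps an exponentially weighted running average of past gradients across update steps (not epochs) to give your gradient descent some "momentum".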
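
To make the two ideas concrete, here's a rough sketch in plain Python. Everything in it is made up for illustration: the 1-D linear model, the toy data, the helper names (`point_grad`, `batch_grad`, `ema_update`), and the hyperparameters. The running average is only the first-moment part of Adam, without bias correction or the second-moment scaling, so treat it as a sketch and not as Adam itself.

```python
import random

def point_grad(w, x, y):
    # Gradient of 0.5 * (w*x - y)**2 with respect to w, for a single point
    return (w * x - y) * x

def batch_grad(w, batch):
    # Mini-batch gradient: the mean of the per-point gradients
    return sum(point_grad(w, x, y) for x, y in batch) / len(batch)

def ema_update(m, g, beta=0.9):
    # Exponentially weighted running average of gradients
    # (like Adam's first-moment estimate, minus the bias correction)
    return beta * m + (1 - beta) * g

random.seed(0)
true_w = 3.0  # assumed "ground truth" slope for the toy data
xs = [random.uniform(-1, 1) for _ in range(256)]
data = [(x, true_w * x + random.gauss(0, 0.5)) for x in xs]

w, m, lr = 0.0, 0.0, 0.1
for step in range(500):
    batch = random.sample(data, 32)   # average over 32 points, not just 1
    g = batch_grad(w, batch)
    m = ema_update(m, g)              # smooth the gradient across steps
    w -= lr * m

print(w)  # should end up near true_w
```

With a batch size of 1, `batch_grad` is just the single-point gradient, so the same loop lets you compare how noisy the updates are in each case.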