onedertainer t1_ivp6xgj wrote
This sounds like it might be mini-batch gradient descent, where each update uses the average (or sum) of the gradients over a batch of points. Or maybe something like Adam, which keeps an exponential moving average of past gradients across update steps (not epochs) to give your gradient descent some "momentum".
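To make the two ideas concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the toy problem (fitting a single weight `w` so that `w * x` matches `y = 3x`), the helper names `minibatch_grad` and `train`, and the hyperparameters. The Adam branch follows the standard update rule with the usual default `beta1`, `beta2`, and `eps`; it is not taken from any particular library.

```python
import math
import random


def minibatch_grad(w, batch):
    """Average gradient of the squared error (w*x - y)^2 over one mini-batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data, use_adam, lr=0.05, batch_size=8, epochs=100, seed=0,
          beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit a single weight w with mini-batch updates (hypothetical helper)."""
    rng = random.Random(seed)
    data = list(data)
    w = 0.0
    m = 0.0  # moving average of gradients (the "momentum" part of Adam)
    v = 0.0  # moving average of squared gradients (per-step scaling)
    t = 0    # number of update steps taken so far
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            # One gradient estimate per batch, averaged over its points.
            g = minibatch_grad(w, data[i:i + batch_size])
            if use_adam:
                t += 1
                m = beta1 * m + (1 - beta1) * g
                v = beta2 * v + (1 - beta2) * g * g
                # Bias correction, since m and v start at zero.
                m_hat = m / (1 - beta1 ** t)
                v_hat = v / (1 - beta2 ** t)
                w -= lr * m_hat / (math.sqrt(v_hat) + eps)
            else:
                # Plain mini-batch gradient descent.
                w -= lr * g
    return w


# Toy, noise-free data with true slope 3 (an assumption for this sketch).
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(-20, 21)]
w_sgd = train(data, use_adam=False)
w_adam = train(data, use_adam=True)
print(w_sgd, w_adam)
```

Both runs should end up close to the true slope of 3. The difference is only in how each step uses the batch gradient: the plain version applies it directly, while the Adam branch smooths it over previous steps first.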