Difficult_Ferret2838 t1_ivrom17 wrote
Reply to comment by make3333 in [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
>gradient descent takes the direction of the minimum at the step size according to the taylor series of degree n at that point.
No. Gradient descent is first order by definition: the update uses only the gradient, i.e. a degree-1 Taylor approximation, not a degree-n one.
>in a lot of other optimization settings they do second order approx to find the optimal direction
Even then it isn't an "optimal" direction. A second-order (Newton) step is only exact for the local quadratic model, not for the true objective.
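To make the first-order vs. second-order distinction concrete, here is a minimal sketch (my own illustration, not from the thread) on a quadratic f(x) = ½ xᵀAx. The gradient-descent direction uses only first-order information, while the Newton direction folds in the Hessian; the two generally point different ways, and neither is "optimal" for a non-quadratic objective.

```python
import numpy as np

# Ill-conditioned quadratic f(x) = 0.5 * x^T A x; A is its Hessian.
A = np.array([[10.0, 0.0],
              [0.0, 1.0]])
x = np.array([1.0, 1.0])  # current iterate

grad = A @ x                             # first-order information only
gd_dir = -grad                           # gradient-descent direction
newton_dir = -np.linalg.solve(A, grad)   # second-order (Newton) direction

# Normalize so we can compare the directions themselves.
gd_unit = gd_dir / np.linalg.norm(gd_dir)
newton_unit = newton_dir / np.linalg.norm(newton_dir)
print("gradient direction:", gd_unit)
print("newton direction:  ", newton_unit)
```

On this example the Newton direction points straight at the minimum (here, the origin), while the gradient direction is skewed toward the high-curvature axis; gradient descent with a fixed step would zigzag.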
Difficult_Ferret2838 t1_ivrnegq wrote
Reply to comment by make3333 in [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
That doesn't mean anything.
Difficult_Ferret2838 t1_ivrnctl wrote
Reply to [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
Are you not talking about batch size?
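Since the question ("average gradient vs. the gradient of one point") is really about batch size, here is a hedged sketch (my own example, assuming a least-squares loss) of the standard trade-off: the minibatch gradient is an unbiased estimate of the full gradient either way, but averaging over a larger batch shrinks its variance, at a proportionally higher cost per step.

```python
import numpy as np

# Synthetic least-squares problem: loss = mean((X w - y)^2) / 2.
rng = np.random.default_rng(0)
n, d = 10_000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
w = np.zeros(d)  # evaluate gradients at an arbitrary fixed point

def grad(indices, w):
    """Minibatch gradient over the given sample indices."""
    Xb, yb = X[indices], y[indices]
    return Xb.T @ (Xb @ w - yb) / len(indices)

full = grad(np.arange(n), w)  # full-batch (exact) gradient

def est_variance(batch_size, trials=500):
    """Mean squared deviation of the minibatch gradient from the full one."""
    devs = [np.linalg.norm(grad(rng.choice(n, batch_size, replace=False), w) - full) ** 2
            for _ in range(trials)]
    return float(np.mean(devs))

print("batch=1: ", est_variance(1))
print("batch=64:", est_variance(64))
```

The variance drops roughly like 1/batch_size, which is why "averaging the gradient" buys you less noise per step but not more information per sample.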
Difficult_Ferret2838 t1_ivqb6wz wrote
Reply to [D] At what tasks are models better than humans given the same amount of data? by billjames1685
Multivariate nonlinear systems, e.g. reactors.
Difficult_Ferret2838 t1_ivtprrn wrote
Reply to comment by kksnicoh in [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
Exactly, that is a meaningless phrase.