name_not_acceptable t1_ivsi5i1 wrote on November 10, 2022 at 7:46 AM
Reply to [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS
Wouldn't you be better off using the parallel computation to calculate the gradient at lots of random points and then seeing whether you can find a better local minimum? I.e., do the runs all converge to different points?
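
To make the idea concrete, here is a rough sketch of what such a multi-start run could look like. Everything in it is an assumption chosen for illustration: the 1-D test function `f`, its gradient `grad_f`, the helper name `multi_start_descent`, the learning rate, the step count, and the number of starting points. It stands in for a real loss and a real parallel setup, with NumPy vectorization playing the role of the parallel computation.

```python
import numpy as np

# Hypothetical non-convex test function with two local minima
# (chosen only for illustration, not taken from the thread).
def f(x):
    return x**4 - 3 * x**2 + x

def grad_f(x):
    return 4 * x**3 - 6 * x + 1

def multi_start_descent(starts, lr=0.01, steps=2000):
    """Run plain gradient descent from every start point at once."""
    x = np.asarray(starts, dtype=float).copy()
    for _ in range(steps):
        # One vectorized update moves all runs in parallel.
        x -= lr * grad_f(x)
    return x

rng = np.random.default_rng(0)
starts = rng.uniform(-2.0, 2.0, size=32)   # lots of random starting points
finals = multi_start_descent(starts)

# Do the runs converge to different points? Group them by rounded end point.
minima = np.unique(np.round(finals, 3))
# Keep the end point with the lowest function value.
best = finals[np.argmin(f(finals))]

print("distinct end points:", minima)
print("best end point:", best, "with f =", f(best))
```

If the runs end up in more than one place, comparing `f` at the end points tells you which local minimum is the better one; if they all land on the same point, the extra starts bought you nothing on that problem.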