The_humble_tortoise t1_j1teb45 wrote on December 27, 2022 at 6:21 AM

Reply to [D] ANN for sine wave prediction by T4KKKK

https://www.resceu.s.u-tokyo.ac.jp/workshops/resceu20s/docs/hartwig.pdf

I guess you could try this. I have always wanted to see whether it works, but admittedly I have been too lazy to try it.

The_humble_tortoise t1_ivpn0u5 wrote on November 9, 2022 at 6:15 PM

Reply to [D] Is there an advantage in learning when taking the average Gradient compared to the Gradient of just one point by CPOOCPOS

My take is that your question basically comes down to the difference between SGD (one sample per update) and mini-batch SGD (the average gradient over several samples per update). Mini-batch SGD in general provides a more robust (less overfitting) and smoother (less zig-zag) solution, but it really depends on the data. More homogeneous data usually works fine with plain SGD; otherwise mini-batch SGD will tend to perform better.
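
To make the "smoother" part concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the toy 1-D linear regression, the noise level, the batch size of 32, and the helper names `make_data`, `sample_grad`, `batch_grad`, and `gradient_spread` are all hypothetical. It only compares how much a single-sample gradient estimate scatters around the full gradient versus an averaged mini-batch estimate.

```python
import random
import statistics


def make_data(n, true_w=3.0, noise=0.5, seed=0):
    """Toy data (assumed for illustration): y = true_w * x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_w * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def sample_grad(w, x, y):
    """Gradient of the squared error (w*x - y)**2 with respect to w for one point."""
    return 2.0 * (w * x - y) * x


def batch_grad(w, xs, ys, idx):
    """Average of the per-point gradients over the indices in idx."""
    return sum(sample_grad(w, xs[i], ys[i]) for i in idx) / len(idx)


def gradient_spread(w, xs, ys, batch_size, trials=2000, seed=1):
    """Draw many random batches; return mean and std. dev. of the gradient estimates."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = [
        batch_grad(w, xs, ys, rng.sample(range(n), batch_size))
        for _ in range(trials)
    ]
    return statistics.mean(estimates), statistics.stdev(estimates)


xs, ys = make_data(1000)
w = 0.0  # arbitrary starting weight

full = batch_grad(w, xs, ys, range(len(xs)))            # gradient over all data
mean_1, std_1 = gradient_spread(w, xs, ys, batch_size=1)    # single-point SGD
mean_32, std_32 = gradient_spread(w, xs, ys, batch_size=32)  # mini-batch SGD

print(f"full gradient:        {full:.3f}")
print(f"batch size 1:  mean {mean_1:.3f}, std {std_1:.3f}")
print(f"batch size 32: mean {mean_32:.3f}, std {std_32:.3f}")
```

In this setup both estimators should average out close to the full gradient, while the mini-batch estimate scatters far less from step to step, which is what the less zig-zag path amounts to. The sketch says nothing about overfitting or about which variant ends up performing better on real data; it only illustrates gradient noise.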