The_humble_tortoise t1_ivpn0u5 wrote
My take on your question is that it comes down to the difference between SGD and mini-batch SGD. Mini-batch SGD generally gives a more robust (less prone to overfitting) and smoother (less zig-zag) solution, but it really depends on the data. More homogeneous data usually works better with SGD; otherwise, mini-batch SGD will perform better.
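To make the comparison concrete, here is a minimal sketch (not from the original thread) that runs the same update loop with a batch size of 1 (plain SGD) and a batch size of 32 (mini-batch SGD) on a toy linear regression. The function name `sgd_linear_regression`, the synthetic data, the learning rate, and the batch sizes are all assumptions picked for illustration.

```python
import numpy as np

def sgd_linear_regression(X, y, batch_size, lr=0.02, epochs=20, seed=0):
    """Fit y ~ X @ w with squared loss. batch_size=1 is plain SGD."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    losses = []  # full-data MSE after every parameter update
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            residual = X[idx] @ w - y[idx]
            # Gradient of the mean squared error over the current batch
            grad = 2.0 * X[idx].T @ residual / len(idx)
            w -= lr * grad
            losses.append(float(np.mean((X @ w - y) ** 2)))
    return w, losses

# Synthetic, fairly homogeneous data (an assumption for this sketch)
rng = np.random.default_rng(42)
X = rng.normal(size=(256, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=256)

w_sgd, loss_sgd = sgd_linear_regression(X, y, batch_size=1)
w_mb, loss_mb = sgd_linear_regression(X, y, batch_size=32)
```

Plotting `loss_sgd` and `loss_mb` against the update index is one way to see how much each trajectory zig-zags on your own data. Which one ends up better will depend on the dataset, as noted above.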