Red-Portal t1_iqpczjb wrote
Reply to comment by 029187 in [Discussion] If we had enough memory to always do full batch gradient descent, would we still need rmsprop/momentum/adam? by 029187
People have tried it, and so far no one has been able to achieve the same effect. It's still somewhat of an open research problem.
029187 OP t1_iqpigzv wrote
ah cool! do you have any links to papers on the topic? i'd love to read them!
Red-Portal t1_iqpipq6 wrote
I think it was this one: https://arxiv.org/abs/2103.17182
029187 OP t1_iqrihvc wrote
thanks!!