029187 OP t1_iqrinm2 wrote
Reply to comment by dasayan05 in [Discussion] If we had enough memory to always do full batch gradient descent, would we still need rmsprop/momentum/adam? by 029187
If it's only as good, then it has no benefit. But if it ends up being better, then it is useful for situations where we have enough memory.
https://arxiv.org/abs/2103.17182
This paper claims to have found some interesting ways that might make it better.