MustachedSpud t1_j8t65fh wrote
Reply to comment by ChuckSeven in [D] Lion , An Optimizer That Outperforms Adam - Symbolic Discovery of Optimization Algorithms by ExponentialCookie
Yeah, it's very configuration-dependent, but larger batch sizes usually learn faster, so there's a tendency to lean into that.