
yolky t1_j3oa0oc wrote

The work probably isn't useful for most DL practitioners (yet), but it has lots of applications for deep learning research, even for people outside of deep learning theory. As an example, consider the work on the Neural Tangent Kernel (NTK), which studies one particular infinite-width limit of neural networks. While the original work was just trying to understand wide fully connected networks, its impact now in 2023 is immense.
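
(To make the object concrete: below is a minimal sketch of the *empirical* NTK, i.e. the Gram matrix of per-example parameter gradients, for a toy two-layer MLP in JAX. The architecture and sizes are illustrative choices of mine, not anything from the NTK papers; the theoretical result is that as the width grows, this kernel converges to a fixed, deterministic limit that stays constant during training.)

```python
# Minimal sketch of the empirical (finite-width) NTK in JAX.
# The two-layer MLP and its sizes are illustrative, not from any paper.
import jax
import jax.numpy as jnp

def init_params(key, d_in=10, width=512):
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (d_in, width)) / jnp.sqrt(d_in),
        "w2": jax.random.normal(k2, (width, 1)) / jnp.sqrt(width),
    }

def f(params, x):
    # Scalar-output MLP; the NTK is defined w.r.t. these parameters.
    return (jnp.tanh(x @ params["w1"]) @ params["w2"]).squeeze(-1)

def empirical_ntk(params, x1, x2):
    # K[i, j] = <grad_theta f(x1_i), grad_theta f(x2_j)>
    j1 = jax.jacobian(f)(params, x1)   # pytree of (n1, *param_shape) arrays
    j2 = jax.jacobian(f)(params, x2)
    flat = lambda j: jnp.concatenate(
        [leaf.reshape(leaf.shape[0], -1) for leaf in jax.tree_util.tree_leaves(j)],
        axis=1,
    )
    return flat(j1) @ flat(j2).T

key = jax.random.PRNGKey(0)
params = init_params(key)
x = jax.random.normal(key, (4, 10))
print(empirical_ntk(params, x, x).shape)  # (4, 4)
```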

A lot of new algorithms for things like active learning, meta-learning, etc. use NTK theory as motivation for their development. You can pretty much search "a neural tangent kernel perspective on ____" on Google and get a ton of results, a mixture of applied algorithms and theoretical analyses.

So this is just one example of how understanding DL theory leads to better algorithms. One part of Greg Yang's work can be seen as generalizing NTK theory to different infinite-width limits. At the moment there don't seem to be many applications of his work, but of course the same could have been said about the NTK in 2018. His "Tensor Programs V" paper already shows one concrete application: choosing hyperparameters for large neural networks by tuning smaller ones as a proxy (µTransfer).
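
To give a feel for what that buys you, here is a rough sketch of that tune-small-transfer-big workflow in JAX. The toy task, widths, and learning-rate grid are all made up, and a real run would use the paper's maximal-update parameterization (µP) rather than the standard parameterization used here, so treat this as the shape of the workflow, not a faithful µTransfer implementation.

```python
# Rough sketch of the "Tensor Programs V" workflow: sweep a hyperparameter on
# a narrow proxy model, then reuse the winner at full width. Task, widths and
# grid are made up; the transfer is only guaranteed under the paper's
# maximal-update parameterization, which this toy model omits.
import jax
import jax.numpy as jnp

def init(key, width):
    k1, k2 = jax.random.split(key)
    return {"w1": jax.random.normal(k1, (8, width)) / jnp.sqrt(8.0),
            "w2": jax.random.normal(k2, (width, 1)) / jnp.sqrt(width)}

def loss(params, x, y):
    pred = jnp.tanh(x @ params["w1"]) @ params["w2"]
    return jnp.mean((pred - y) ** 2)

def train(width, lr, steps=200, seed=0):
    key = jax.random.PRNGKey(seed)
    x = jax.random.normal(key, (64, 8))
    y = jnp.sin(x.sum(axis=1, keepdims=True))   # toy regression target
    params = init(key, width)
    step = jax.jit(lambda p: jax.tree_util.tree_map(
        lambda w, g: w - lr * g, p, jax.grad(loss)(p, x, y)))
    for _ in range(steps):
        params = step(params)
    return loss(params, x, y)

# 1) Tune on a cheap, narrow proxy model.
grid = [1e-3, 1e-2, 1e-1]
best_lr = min(grid, key=lambda lr: train(width=128, lr=lr))
# 2) Reuse the winning hyperparameter at the target width.
print(best_lr, train(width=4096, lr=best_lr))
```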

So TL;DR - there might not be practical applications yet, but there are potentially a lot!
