elbiot t1_irwyleo wrote

The fact that you can throw a bunch of compute at transformers is part of their superiority. Even if it's the only factor, it's really important.

26

_Arsenie_Boca_ t1_irx1ubl wrote

That's definitely a fair point (although you can do that with recurrent models as well, see the Reddit link in my other comment). Anyway, the more general point about multiple changes stands. Maybe I chose a bad example.

3