Submitted by Smooth-Earth-9897 t3_11nzinb in MachineLearning
multiverseportalgun t1_jbr55gh wrote
Reply to comment by Hostilis_ in [D] What's the Time and Space Complexity of Transformer Models Inference? by Smooth-Earth-9897
Quadratic 🤢
Hostilis_ t1_jbr5iul wrote
Yeah, quadratic scaling in context length is a problem lol. Hopefully RWKV will come to the rescue.
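For a rough sense of what "quadratic" means here: in standard self-attention, every token is scored against every other token, so the score matrix for a context of n tokens has n × n entries per head. A minimal sketch of that count follows (the function name `attention_score_entries` is made up for illustration, and this only counts the score matrix, not the rest of the model):

```python
def attention_score_entries(seq_len: int, num_heads: int = 1) -> int:
    """Count entries in the (seq_len x seq_len) attention score matrices.

    Hypothetical helper for illustration: it assumes plain (dense)
    self-attention and ignores everything except the score matrices.
    """
    return num_heads * seq_len * seq_len


if __name__ == "__main__":
    # Doubling the context length quadruples the number of score entries.
    for n in (1024, 2048, 4096):
        print(f"context={n:5d}  score entries per head={attention_score_entries(n):,}")
```

So going from a 1k to a 2k context costs 4x as much in that step, not 2x.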