Submitted by Smooth-Earth-9897 t3_11nzinb in MachineLearning
Hostilis_ t1_jbr5iul wrote
Reply to comment by multiverseportalgun in [D] What's the Time and Space Complexity of Transformer Models Inference? by Smooth-Earth-9897
Yeah, self-attention's quadratic scaling in context length is a problem lol. Hopefully RWKV will come to the rescue.
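For anyone wondering where the quadratic term comes from, here's a rough back-of-the-envelope sketch. It assumes plain single-head self-attention over the full sequence, with no sparse or linear-attention tricks. The function name `attention_cost` and the head dimension of 64 are made up for illustration and don't come from any library.

```python
def attention_cost(seq_len: int, d_head: int) -> dict:
    """Rough multiply-add count for one full self-attention pass (hypothetical helper)."""
    # Q @ K^T: (n x d) times (d x n) gives an n x n score matrix, costing n*n*d multiply-adds.
    score_macs = seq_len * seq_len * d_head
    # softmax(scores) @ V: (n x n) times (n x d), another n*n*d multiply-adds.
    value_macs = seq_len * seq_len * d_head
    return {
        "macs": score_macs + value_macs,
        # The score matrix itself has n*n entries, so memory is quadratic too.
        "score_entries": seq_len * seq_len,
    }


if __name__ == "__main__":
    for n in (1024, 2048, 4096):
        cost = attention_cost(n, d_head=64)
        print(n, cost["macs"], cost["score_entries"])
```

Doubling the context length quadruples both numbers, which is the scaling complained about above. Recurrent-style designs like RWKV aim to avoid building the n x n score matrix in the first place.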