Submitted by enryu42 t3_122ppu0 in MachineLearning
Haycart t1_jdu7hlp wrote
Reply to comment by visarga in [D] GPT4 and coding problems by enryu42
Oh, you are probably correct. So it'd be O(n^2) overall for autoregressive decoding, which still exceeds the O(n log n) that the linked post says is required for multiplication, though.
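To make the quadratic total concrete, here is a rough sketch that counts query-key comparisons during decoding. It assumes cached keys (so step t only attends over the t positions seen so far), an empty prompt, and it ignores model width and layer count. The function names are made up for illustration, not taken from any library.

```python
import math


def decoding_attention_ops(n_tokens: int) -> int:
    """Count query-key comparisons when generating n_tokens one at a time.

    Assumption: keys are cached, so step t compares one new query
    against t positions (itself included).
    """
    return sum(t for t in range(1, n_tokens + 1))


def n_log_n(n: int) -> float:
    """Reference growth rate n * log2(n), for comparison only."""
    return n * math.log2(n)


for n in (16, 256, 4096):
    # The first count grows like n^2 / 2, the second like n log n.
    print(n, decoding_attention_ops(n), round(n_log_n(n)))
```

The sum 1 + 2 + ... + n equals n(n+1)/2, so the total is quadratic in the number of generated tokens, comfortably above n log n for any nontrivial n.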