felheartx
felheartx t1_jcli6si wrote
Reply to comment by -Rizhiy- in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
You said working with external memory is not as straightforward. Can you explain that?
I've read this: https://arxiv.org/abs/2301.04589 and even though I'm not super familiar with the details, to my untrained eye it seems like attaching external memory is easier than extending the context size.
Just from reading posts on this subreddit, I get the feeling that pushing context sizes larger and larger is very difficult, whereas simply attaching this sort of "dictionary" thing seems pretty easy to do.
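To make the "dictionary" idea concrete: the rough intuition is an associative key-value store the model can read from, so old information doesn't have to sit inside the context window. Here's a minimal sketch of that kind of lookup, assuming simple dot-product retrieval over stored vectors; the `ExternalMemory` class and its parameters are hypothetical, not from the paper.

```python
import numpy as np

class ExternalMemory:
    """Hypothetical fixed-capacity key-value memory (illustration only).

    Keys and values are vectors; reading a query returns a softmax-weighted
    mix of the values whose keys best match the query.
    """

    def __init__(self, dim, capacity=1024):
        self.keys = np.empty((0, dim))
        self.values = np.empty((0, dim))
        self.capacity = capacity

    def write(self, keys, values):
        # Append new entries, keeping only the most recent `capacity` rows.
        self.keys = np.concatenate([self.keys, keys])[-self.capacity:]
        self.values = np.concatenate([self.values, values])[-self.capacity:]

    def read(self, query, top_k=4):
        if len(self.keys) == 0:
            return np.zeros_like(query)
        scores = self.keys @ query           # similarity to every stored key
        idx = np.argsort(scores)[-top_k:]    # indices of the top-k matches
        w = np.exp(scores[idx] - scores[idx].max())
        w /= w.sum()                         # softmax over retrieved scores
        return w @ self.values[idx]          # weighted mix of their values

# Toy usage: store basis vectors, then query near one of them.
mem = ExternalMemory(dim=4, capacity=8)
mem.write(np.eye(4), 2 * np.eye(4))
out = mem.read(np.array([10.0, 0.0, 0.0, 0.0]), top_k=1)
```

Note this sidesteps the quadratic cost of attention over a long context: the memory can hold far more entries than the window, and each read only touches the top few matches.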
felheartx t1_jclij93 wrote
Reply to comment by hfnuser0000 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
I have no idea how many other ways there are, but this looks extremely promising: https://arxiv.org/abs/2301.04589
So there's at least one :P