Submitted by super_deap t3_11tmpc5 in MachineLearning
CleanThroughMyJorts t1_jck7114 wrote
Reply to comment by Spiritual-Reply5896 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
I don't think the two are mutually exclusive.
The problem with retrieval, though (at least in current implementations), is that the model can't attend to memory globally the way it does with in-context memory; you're bottlenecked by the retrieval step, which has to bring things into context through a local search.
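To make the bottleneck concrete, here is a toy sketch in plain Python. It is an illustration under stated assumptions, not how any particular retrieval system works: the function names, the dot-product similarity, the top-k cutoff, and the 2-d vectors are all made up for the example. Global attention gives every memory item a nonzero weight, while retrieve-then-attend gives exactly zero weight to anything the search step did not return.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def global_attention_weights(query, memory):
    # Attend over every memory item, as attention over the context window does.
    return softmax([dot(query, item) for item in memory])

def retrieval_attention_weights(query, memory, k):
    # Hypothetical retrieval step: keep only the top-k items by similarity,
    # then attend over those. Everything else never reaches the model.
    scores = [dot(query, item) for item in memory]
    top = sorted(range(len(memory)), key=lambda i: scores[i], reverse=True)[:k]
    kept = softmax([scores[i] for i in top])
    weights = [0.0] * len(memory)
    for i, w in zip(top, kept):
        weights[i] = w
    return weights

# Toy data (assumed for illustration only).
memory = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.2]

full = global_attention_weights(query, memory)
retrieved = retrieval_attention_weights(query, memory, k=2)

print("global:   ", [round(w, 3) for w in full])
print("retrieved:", [round(w, 3) for w in retrieved])
```

With k=2, the last two memory items get zero weight in the retrieval version no matter how useful they might be in combination with the others, which is the "can't attend globally" part.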