[D] PyTorch 2.0 Native Flash Attention 32k Context Window Submitted by super_deap t3_11tmpc5 on March 17, 2023 at 9:59 AM in MachineLearning 99 comments 345
tripple13 t1_je5seed wrote on March 29, 2023 at 4:48 PM Reply to comment by mike94025 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap Is that right? I somehow ended up here when trying to assess what the F.multi_head_attention call does in the class definition. But I trust you're right, it would only make sense; I just couldn't identify the calls myself. Permalink Parent 1
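For context, here is a minimal sketch of how the PyTorch 2.0 dispatch can be checked directly, rather than tracing the calls inside the class definition. The tensor shapes and backend flags are illustrative assumptions, not taken from the thread; forcing the flash backend makes the call raise an error if flash attention cannot actually be used, which is one way to verify what is running underneath.

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: batch 2, 8 heads, sequence length 1024, head dim 64.
# Flash attention in PyTorch 2.0 expects fp16/bf16 tensors on a CUDA device.
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

# Allow only the flash kernel; if the inputs don't qualify, this raises
# instead of silently falling back to the math or mem-efficient backends.
with torch.backends.cuda.sdp_kernel(enable_flash=True,
                                    enable_math=False,
                                    enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)

print(out.shape)  # torch.Size([2, 8, 1024, 64])
```

As the thread suggests, the nn.MultiheadAttention fast path can route to this same scaled_dot_product_attention kernel under the right conditions, so checking the fused call directly is a quicker sanity test than reading the module source.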