Submitted by super_deap t3_11tmpc5 in MachineLearning
mike94025 t1_je5nrdi wrote
Reply to comment by oathbreakerkeeper in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
You’re looking in the wrong place. What you’re looking at is the BT gen1 fastpath, not the BT gen2 custom kernels.
You need to look at F.multi_head_attention_forward().
For now, the fastpath still services inference until a full rewrite of activation.py, which will hopefully be refactored in a future release. (There’s always a tension between refactoring and introducing new features under a time- and staffing-constrained problem formulation.)
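For context, a minimal sketch of calling the PyTorch 2.0 scaled-dot-product-attention entry point that these custom kernels back (the tensor shapes here are illustrative, not from the thread):

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: batch 2, 8 heads, sequence length 16, head dim 64.
q = torch.randn(2, 8, 16, 64)
k = torch.randn(2, 8, 16, 64)
v = torch.randn(2, 8, 16, 64)

# F.scaled_dot_product_attention is the PyTorch 2.0 API that dispatches to
# fused kernels (flash / memory-efficient attention) when they are available
# for the given dtype, device, and shapes; otherwise it falls back to a
# plain math implementation.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 16, 64])
```

Which backend actually runs depends on hardware and inputs; the math fallback keeps the call working everywhere.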