[P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels
Submitted by pommedeterresautee t3_ydqmjp on October 26, 2022 at 6:10 AM in MachineLearning
40 comments 352
pm_me_your_ensembles t1_itw6sti wrote on October 26, 2022 at 7:21 PM
Bless you, I needed this :D