gionnelles t1_itujc1u wrote on October 26, 2022 at 12:37 PM
Reply to [P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels by pommedeterresautee
This is very exciting; my team will be checking this out ASAP. This is fantastic for R&D folks looking to move models toward production with much less effort.