reSAMpled t1_iuj2yrm wrote
Reply to comment by pommedeterresautee in [P] Up to 12X faster GPU inference on Bert, T5 and other transformers with OpenAI Triton kernels by pommedeterresautee
I also haven't used spaCy in a while, but I'm pretty sure there is no way to make this work with `-sm`, `-md`, or `-lg` models. What Michaël says should be true for `-trf` models, though I don't think it will be easy: spacy-transformers already has to wrap HF models so they expose a thinc API, so you would have to dig deep in there to call Kernl's `optimize_model`.