pommedeterresautee t1_iuaodj2 wrote on October 29, 2022 at 9:36 PM

Reply to comment by big_dog_2k in [D] How to get the fastest PyTorch inference and what is the "best" model serving framework? by big_dog_2k

Yes for Ampere.

For HF models, the kernels work for most of them out of the box, but you need search-and-replace patterns for your specific architecture. That's why we do not have our own implementations of X and Y.

Check https://github.com/ELS-RD/kernl/blob/main/src/kernl/optimizer/linear.py for an example.
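
To make the "search-and-replace patterns" idea concrete, here is a toy sketch in plain Python. It is not Kernl's API: the "graph" is just a flat list of op names, and every name in it (`replace_pattern`, `RULES`, `fused_linear_gelu`, `fused_attention`) is hypothetical. The real optimizer works on a traced PyTorch graph, see the linked file for that. The point is only that a fused kernel is used where a registered pattern matches, so an architecture that arranges its ops differently needs its own pattern.

```python
# Toy illustration only: a "graph" here is a flat list of op names.
# All names (replace_pattern, RULES, fused_* ops) are hypothetical, not Kernl's API.

def replace_pattern(ops, pattern, replacement):
    """Replace each non-overlapping occurrence of `pattern` in `ops` with `replacement`."""
    if not pattern:
        raise ValueError("pattern must contain at least one op")
    out, i, n = [], 0, len(pattern)
    while i < len(ops):
        if ops[i:i + n] == pattern:
            out.extend(replacement)  # swap the matched ops for the fused op
            i += n
        else:
            out.append(ops[i])       # no match here, keep the op as-is
            i += 1
    return out

# One (pattern, replacement) rule per fusion we know how to do.
RULES = [
    (["linear", "gelu"], ["fused_linear_gelu"]),
    (["matmul", "softmax", "matmul"], ["fused_attention"]),
]

def optimize(ops, rules=RULES):
    """Apply every rule in order and return the rewritten op list."""
    for pattern, replacement in rules:
        ops = replace_pattern(ops, pattern, replacement)
    return ops

# An architecture whose op sequence matches the registered patterns gets fused.
covered = ["matmul", "softmax", "matmul", "linear", "gelu", "linear"]
print(optimize(covered))

# A slightly different architecture (extra scale op, tanh instead of gelu)
# matches nothing, so it stays unoptimized until a pattern is added for it.
uncovered = ["matmul", "scale", "softmax", "matmul", "linear", "tanh"]
print(optimize(uncovered))
```

In this sketch, supporting the second architecture means adding rules such as `(["linear", "tanh"], ["fused_linear_tanh"])` to the list, which is the toy analogue of writing patterns for your own model.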

big_dog_2k OP t1_iuaw55q wrote on October 29, 2022 at 10:35 PM

Great. I might try this out, as I like the direction this is going, plus it seems like PyTorch is heading in a similar direction. I'll let you know if I have questions, or I will raise them on GitHub. I appreciate all the information!