sobagood t1_iu6zuhk wrote
Reply to comment by big_dog_2k in [D] How to get the fastest PyTorch inference and what is the "best" model serving framework? by big_dog_2k
If you mean Nvidia GPUs: OpenVINO has a CUDA plugin for running on Nvidia GPUs, but I have never tried it. It has several other device plugins as well, so you could check those out. It also provides its own deployment server. Nvidia Triton also supports the OpenVINO runtime, without GPU support, for an obvious reason. Like ONNX, OpenVINO transforms the graph into its own intermediate representation with the 'model optimizer', a step that can go wrong. If you can successfully create this representation, there should be no new bottleneck.
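For what it's worth, a minimal sketch of that conversion step (filenames are made up for the example; the `mo` CLI ships with the `openvino-dev` pip package):

```python
# Offline step, run once in a shell:
#   mo --input_model model.onnx --output_dir ir/
# This is the 'model optimizer' pass that can go wrong; if it
# succeeds it emits ir/model.xml (graph) and ir/model.bin (weights).

from openvino.runtime import Core

core = Core()
# If read_model/compile_model succeed, the IR is usable and there
# should be no new bottleneck at inference time.
model = core.read_model("ir/model.xml")
compiled = core.compile_model(model, "CPU")
print(compiled.inputs, compiled.outputs)
```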
sobagood t1_iu6ucfo wrote
Reply to [D] How to get the fastest PyTorch inference and what is the "best" model serving framework? by big_dog_2k
If you intend to run on CPUs or other Intel hardware, OpenVINO is a great choice. Intel optimised it for their own hardware, and it is indeed faster there than the alternatives.
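To illustrate, a minimal CPU inference sketch with the OpenVINO Python runtime (model path and input shape are assumptions for the example):

```python
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("ir/model.xml")      # IR produced by the model optimizer
compiled = core.compile_model(model, "CPU")  # "GPU" here would mean an Intel GPU, not Nvidia

# Dummy input; replace with your real preprocessing (shape is made up).
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled([x])[compiled.output(0)]
print(result.shape)
```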
sobagood t1_iu801e4 wrote
Reply to comment by big_dog_2k in [D] How to get the fastest PyTorch inference and what is the "best" model serving framework? by big_dog_2k
I don't think they support AMD, since the two are rivals.