
sobagood t1_iu6ucfo wrote

If you intend to run on CPUs or other Intel hardware, OpenVINO is a great choice. Intel optimised it for their own hardware, and on that hardware it is indeed faster than the alternatives.
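Getting started is only a few lines. A minimal sketch of CPU inference with the Python API (the model path is a placeholder, and I'm assuming a static input shape):

```python
# Minimal OpenVINO CPU inference sketch (2022.x Python API).
# "model.xml" is a placeholder for an IR file you've already exported.
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")          # IR pair: model.xml + model.bin
compiled = core.compile_model(model, "CPU")   # target the CPU plugin

# Dummy input matching the model's first input (assumed static shape here).
input_tensor = np.random.rand(*compiled.inputs[0].shape).astype(np.float32)
result = compiled([input_tensor])[compiled.output(0)]
print(result.shape)
```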

9

whata_wonderful_day t1_iu81vzp wrote

I tried OpenVINO ~1.5 years back and it didn't match ONNXRuntime on transformers. For CNNs it's the fastest, though. I also found OpenVINO pretty buggy and not user-friendly; I had to fix their internal transformer conversion script myself.
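For reference, the ONNXRuntime side of that comparison is just this. The model path and input names are assumptions for a BERT-style export:

```python
# Rough sketch of CPU inference with ONNXRuntime on a transformer,
# assuming a BERT-style model exported to "model.onnx" (hypothetical path).
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Input names depend on how the model was exported; these are typical
# for a BERT-style graph but are assumptions here.
batch, seq_len = 1, 128
feeds = {
    "input_ids": np.random.randint(0, 30522, (batch, seq_len), dtype=np.int64),
    "attention_mask": np.ones((batch, seq_len), dtype=np.int64),
}
outputs = sess.run(None, feeds)  # None = return all model outputs
print([o.shape for o in outputs])
```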

4

big_dog_2k OP t1_iu6yc7b wrote

Thanks! Does it work with non-Intel chipsets, and how easy have you found it to use?

1

sobagood t1_iu6zuhk wrote

If you mean NVIDIA GPUs, there is a CUDA plugin to run it on them, but I have never tried it. It has several other plugins as well, so you could check those out. It also provides its own deployment server, and NVIDIA Triton supports the OpenVINO runtime too, though without GPU support, for an obvious reason. Similar to ONNX, the workflow transforms the graph into their intermediate representation with the 'Model Optimizer', which is where things can go wrong. If you can successfully create that representation, there should be no new bottleneck. See the sketch below.
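Roughly, the flow looks like this. The IR is produced offline with the Model Optimizer CLI; paths and device names below are illustrative:

```python
# Sketch of the conversion + plugin flow. The IR is produced offline
# with Model Optimizer first, e.g.:
#   mo --input_model model.onnx --output_dir ir/
from openvino.runtime import Core

core = Core()
print(core.available_devices)            # e.g. ['CPU'] or ['CPU', 'GPU']

model = core.read_model("ir/model.xml")  # load the intermediate representation
compiled = core.compile_model(model, "CPU")  # swap "CPU" for another plugin
```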

1

big_dog_2k OP t1_iu7paw3 wrote

Thanks. I might need to take a closer look. I was also thinking of AMD and ARM-based CPUs. I was surprised at how good CPU-based inference can be for some models these days.

1

sobagood t1_iu801e4 wrote

I don't think they support AMD, since the two are rivals.

1