
Taenk t1_j60gdbl wrote

Do these also increase inference speed? How much work is it to switch from CUDA based software to one of these?

4

cdsmith t1_j60q0bs wrote

I can only answer about Groq. I'm not trying to sell you Groq hardware; I just don't know the answers for the other accelerator chips.

Groq very likely increases inference speed and power efficiency over GPUs; that's actually its main purpose. How much depends on the model, though. I'm not in marketing, so I probably don't have the best resources here, but there are some general performance numbers (unfortunately no comparisons) in this article, and this one discusses a very specific case where a Groq chip gets you a 1000x inference performance advantage over the A100.

To run a model on a Groq chip, you would typically start before CUDA enters the picture at all and convert a PyTorch, TensorFlow, or other common-format model into a Groq program using https://github.com/groq/groqflow. If you have custom-written CUDA code, then you've likely got some programming work ahead of you to run on anything besides a GPU.
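Roughly, the workflow looks like the sketch below. The model and shapes are made up for illustration, and I'm going from memory of the groqflow README, where `groqit` is the build entry point; treat it as a sketch rather than gospel:

```python
import torch
from groqflow import groqit

# A toy PyTorch model standing in for whatever you actually want to run.
class TinyMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(128, 256),
            torch.nn.ReLU(),
            torch.nn.Linear(256, 10),
        )

    def forward(self, x):
        return self.net(x)

model = TinyMLP()
inputs = {"x": torch.randn(1, 128)}

# groqit compiles the model into a Groq program; no CUDA involved anywhere.
gmodel = groqit(model, inputs)

# The returned object is called like the original model, but runs on Groq hardware.
outputs = gmodel(**inputs)
```

The point is that you hand groqflow a framework-level model plus example inputs, and it handles the compilation; you only hit real porting work if your pipeline contains hand-written CUDA kernels.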

7

lucidrage t1_j61so7l wrote

>convert a PyTorch, TensorFlow, or other common-format model into a Groq program

Is there any effort being spent on adding a plugin for a high-level framework like Keras to use Groq automatically?

1

cdsmith t1_j62a3yv wrote

I'm not aware of any effort to build it into Keras itself, but Keras models are one of the things you can easily convert to run on Groq chips using groqflow.
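If it works the same way the PyTorch path does, a Keras conversion would look roughly like this (illustrative sketch; the toy model and the input-dict convention are my assumptions, not official docs):

```python
import tensorflow as tf
from groqflow import groqit

# A toy tf.keras model standing in for your real network.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu", input_shape=(128,)),
    tf.keras.layers.Dense(10),
])

# Example inputs; the key name here is assumed for illustration.
inputs = {"x": tf.random.uniform((1, 128))}

# groqflow also accepts tf.keras models, as I understand it; the result
# is a Groq program you call with the same inputs as the original model.
gmodel = groqit(model, inputs)
outputs = gmodel(**inputs)
```

So there's no Keras-specific plugin that I know of, but since groqflow already takes keras.Model objects directly, you don't really need one.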

1