spiky_sugar t1_j3ley7v wrote

It depends on the parameters you set for generation. The choice of decoding method and the output text length can dramatically change both the speed and the quality of the output.

With the GPT-J-6B model on a GPU, I would say it is possible to handle 10,000 requests in a few hours. Using only a CPU will take much longer, but you could maybe get through 2,000 requests in 24 hours. But again, this depends strongly on the input and output text lengths and the decoding type.
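
To put those rough figures in per-request terms, here is a quick back-of-the-envelope sketch. It assumes "a few hours" means 3 hours and that the first figure refers to GPU inference (the comment only implies this by contrast with CPU), so treat the numbers as illustrative, not as benchmarks. The function name `seconds_per_request` is made up for this example.

```python
def seconds_per_request(num_requests: int, hours: float) -> float:
    """Average wall-clock seconds available per request."""
    return hours * 3600 / num_requests


# Assumption: "a few hours" is taken as 3 hours; no exact figure was given.
gpu = seconds_per_request(10_000, 3)
# The CPU estimate from the comment: 2,000 requests in 24 hours.
cpu = seconds_per_request(2_000, 24)

print(f"GPU: {gpu:.2f} s/request")
print(f"CPU: {cpu:.2f} s/request")
print(f"CPU is roughly {cpu / gpu:.0f}x slower per request")
```

Under these assumptions that works out to about one second per request on GPU versus well over half a minute on CPU. Longer outputs or a more expensive decoding method would shift both numbers.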
