Submitted by besabestin t3_10lp3g4 in MachineLearning
londons_explorer t1_j60m5ui wrote
Reply to comment by vivehelpme in Few questions about scalability of chatGPT [D] by besabestin
This isn't true.
The model generates one token at a time, and if you watch the network connection you can see the response streaming in gradually.
I'm pretty sure the answer is returned as fast as OpenAI can generate it on their cluster of GPUs.
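For anyone curious, here's roughly what "one token at a time" looks like in code. This is just a sketch using GPT-2 via Hugging Face transformers as a stand-in (ChatGPT's actual model and serving stack aren't public), but the autoregressive loop is the same idea, and it's why the response streams in gradually instead of appearing all at once.

```python
# Minimal sketch of autoregressive (token-at-a-time) decoding.
# GPT-2 is used purely as an illustrative stand-in for a chat model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer("The model generates", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):  # generate 20 new tokens, one forward pass each
        logits = model(input_ids).logits            # [batch, seq_len, vocab]
        next_id = logits[:, -1, :].argmax(dim=-1)   # greedy pick of the next token
        input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)
        # each new token can be streamed to the client as soon as it exists
        print(tokenizer.decode(next_id), end="", flush=True)
print()
```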