GSG_Zilch t1_jdojtg7 wrote
Reply to Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
We need an acceptable performance that justifies the inference (and potential hosting) cost. Therefore, depending on the complexity of the task, we choose the right size of model to be as cost-efficient as possible.
GPT-3 is not just a 175B model; only its largest version (davinci) has 175B parameters. There are also more lightweight versions for less complex tasks such as text classification.
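As a minimal sketch of that cost-efficiency trade-off: the GPT-3 variant names below are real (ada, babbage, curie, davinci), the prices are approximate 2023 per-1K-token rates, and the numeric "capability tiers" are purely an illustrative assumption, not an official ranking.

```python
# Illustrative sketch: pick the cheapest GPT-3 variant that meets a task's
# capability needs. Prices are approximate 2023 per-1K-token rates; the
# integer "tier" values are an assumption made for this example.

# (model name, assumed capability tier, approx. USD per 1K tokens)
GPT3_VARIANTS = [
    ("ada", 1, 0.0004),
    ("babbage", 2, 0.0005),
    ("curie", 3, 0.0020),
    ("davinci", 4, 0.0200),  # the 175B-parameter version
]

def cheapest_model(required_tier: int) -> str:
    """Return the cheapest variant whose assumed tier meets the requirement."""
    candidates = [(price, name) for name, tier, price in GPT3_VARIANTS
                  if tier >= required_tier]
    if not candidates:
        raise ValueError("no variant meets the required tier")
    return min(candidates)[1]

# Simple text classification rarely needs the largest model:
print(cheapest_model(1))  # ada
print(cheapest_model(4))  # davinci
```

With this kind of lookup, a simple classification job routes to ada at roughly 1/50th the per-token cost of davinci, which is the whole point of not defaulting to the 175B model.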