KD_A (OP) wrote, in reply to a comment by Jean-Porte on [P] CAPPr: use OpenAI or HuggingFace models to easily do zero-shot text classification:
Great question! I have no idea lol.
More seriously, it depends on what you mean by "compare". CAPPr w/ powerful GPT-3+ models is likely gonna be more accurate than bart-large-mnli. But you need to pay to hit OpenAI endpoints, so it's not a fair comparison IMO.
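To make that concrete, here's roughly what the OpenAI route looks like. A minimal sketch: the prompt, class names, and model name (`text-ada-001`) are just placeholders, and the `predict` signature may differ from the current release, so check the repo.

```python
# Minimal sketch of CAPPr + an OpenAI completion model.
# Assumes OPENAI_API_KEY is set; "text-ada-001" is just an example model.
from cappr.openai.classify import predict

tweet = "I loved the new Spider-Man movie!"
prompt = f"Tweet: {tweet}\nThis tweet's sentiment is"
class_names = ("positive", "neutral", "negative")

# Pick the class name that's the most probable completion of the prompt.
pred = predict(prompt, completions=class_names, model="text-ada-001")
print(pred)  # e.g. "positive"
```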
If you can't pay to hit OpenAI endpoints, then a fairer comparison would be CAPPr + GPT-2 (specifically, the smallest one on HuggingFace, or whatever's closest in inference speed to something like bart-large-mnli). But then another issue pops up: GPT-2 was not explicitly trained on the NLI/MNLI task the way bart-large-mnli was. So I'd need to finetune GPT-2 (small) on MNLI to make a fairer comparison.
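Here's a rough sketch of that head-to-head, assuming CAPPr's `cappr.huggingface.classify.predict` interface; the review text, prompt wording, and class names are made up for illustration.

```python
# Rough sketch of the head-to-head (check the CAPPr repo for the exact API).
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from cappr.huggingface.classify import predict

text = "The food was cold but the service was friendly."
class_names = ("positive", "neutral", "negative")

# CAPPr + GPT-2 (small): pick the class name that's the most probable
# completion of the prompt, according to the language model.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
prompt = f"Review: {text}\nThe sentiment of this review is"
cappr_pred = predict(
    prompt, completions=class_names, model_and_tokenizer=(model, tokenizer)
)

# bart-large-mnli: zero-shot classification via NLI entailment scores.
nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
nli_pred = nli(text, candidate_labels=list(class_names))["labels"][0]

print(cappr_pred, nli_pred)
```

Both sides see the same text and the same label set; only the scoring mechanism differs.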
If I had a bunch of compute and time, I'd like to benchmark (or find benchmarks for) the following text classification approaches, varying the amount of training data where feasible, and ideally on tasks which are more realistic than SuperGLUE:

- similarity embeddings (see the sketch after this list)
  - S-BERT
  - GPT-3+ (they claim their ada model is quite good)
- sampling
- MNLI-trained models
- CAPPr
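For the similarity-embeddings approach, here's a minimal sketch using sentence-transformers; the model choice (`all-MiniLM-L6-v2`), example text, and label phrasing are all just placeholders.

```python
# Similarity-embeddings baseline with an S-BERT model.
from sentence_transformers import SentenceTransformer, util

texts = ["The battery died after two days."]
class_names = ["positive", "neutral", "negative"]

model = SentenceTransformer("all-MiniLM-L6-v2")
text_embs = model.encode(texts, convert_to_tensor=True)
label_embs = model.encode(class_names, convert_to_tensor=True)

# Predict the class whose embedding is most cosine-similar to the text's.
sims = util.cos_sim(text_embs, label_embs)  # shape: (len(texts), 3)
preds = [class_names[i] for i in sims.argmax(dim=1)]
print(preds)  # e.g. ["negative"]
```

Embedding fuller label descriptions (e.g. "This review is negative.") instead of bare class names usually gives the similarity method a fairer shot.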