KD_A (OP) wrote, in reply to a comment by Jean-Porte on [P] CAPPr: use OpenAI or HuggingFace models to easily do zero-shot text classification:
Great question! I have no idea lol.
More seriously, it depends on what you mean by "compare". CAPPr w/ powerful GPT-3+ models is likely gonna be more accurate than bart-large-mnli. But you need to pay to hit OpenAI endpoints, so it's not a fair comparison IMO.
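To make that concrete, here's roughly what the OpenAI route looks like. A minimal sketch: the prompt, class names, and model name (`text-ada-001`) are just placeholders, and the `predict` signature may differ from the current release, so check the repo.

```python
# Minimal sketch of CAPPr + an OpenAI completion model.
# Assumes OPENAI_API_KEY is set; "text-ada-001" is just an example model.
from cappr.openai.classify import predict

tweet = "I loved the new Spider-Man movie!"
prompt = f"Tweet: {tweet}\nThis tweet's sentiment is"
class_names = ("positive", "neutral", "negative")

# Pick the class name that's the most probable completion of the prompt.
pred = predict(prompt, completions=class_names, model="text-ada-001")
print(pred)  # e.g. "positive"
```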
If you can't pay to hit OpenAI endpoints, then a fairer comparison would be CAPPr + GPT-2 (specifically, the smallest one on HuggingFace, or whatever's closest in inference speed to something like bart-large-mnli). But then another issue pops up: GPT-2 was not explicitly trained on the NLI/MNLI task the way bart-large-mnli was. So I'd need to finetune GPT-2 (small) on MNLI to make a fairer comparison.
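Here's a rough sketch of that head-to-head, assuming CAPPr's `cappr.huggingface.classify.predict` interface; the review text, prompt wording, and class names are made up for illustration.

```python
# Rough sketch of the head-to-head (check the CAPPr repo for the exact API).
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from cappr.huggingface.classify import predict

text = "The food was cold but the service was friendly."
class_names = ("positive", "neutral", "negative")

# CAPPr + GPT-2 (small): pick the class name that's the most probable
# completion of the prompt, according to the language model.
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
prompt = f"Review: {text}\nThe sentiment of this review is"
cappr_pred = predict(
    prompt, completions=class_names, model_and_tokenizer=(model, tokenizer)
)

# bart-large-mnli: zero-shot classification via NLI entailment scores.
nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
nli_pred = nli(text, candidate_labels=list(class_names))["labels"][0]

print(cappr_pred, nli_pred)
```

Both sides see the same text and the same label set; only the scoring mechanism differs.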
If I had a bunch of compute and time, I'd like to benchmark (or find benchmarks for) the following text classification approaches, varying the amount of training data where feasible, and ideally on tasks which are more realistic than SuperGLUE:

- similarity embeddings (see the sketch after this list)
  - S-BERT
  - GPT-3+ (they claim their ada model is quite good)
- sampling
- MNLI-trained models
- CAPPr
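For the similarity-embeddings approach, here's a minimal sketch using sentence-transformers; the model choice (`all-MiniLM-L6-v2`), example text, and label phrasing are all just placeholders.

```python
# Similarity-embeddings baseline with an S-BERT model.
from sentence_transformers import SentenceTransformer, util

texts = ["The battery died after two days."]
class_names = ["positive", "neutral", "negative"]

model = SentenceTransformer("all-MiniLM-L6-v2")
text_embs = model.encode(texts, convert_to_tensor=True)
label_embs = model.encode(class_names, convert_to_tensor=True)

# Predict the class whose embedding is most cosine-similar to the text's.
sims = util.cos_sim(text_embs, label_embs)  # shape: (len(texts), 3)
preds = [class_names[i] for i in sims.argmax(dim=1)]
print(preds)  # e.g. ["negative"]
```

Embedding fuller label descriptions (e.g. "This review is negative.") instead of bare class names usually gives the similarity method a fairer shot.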