KD_A OP t1_jeggt1k wrote

Yes, exactly. There's nothing else to it haha

I only wish the API had an interface to let you cache the prompt's attention keys and values (the KV cache). That'd save you money and make CAPPr strictly cheaper than sampling for classification tasks.
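With a local model you can already do this via Hugging Face transformers' `past_key_values`. A minimal sketch of the idea, not CAPPr's actual code; the model name, prompt, and class completions are all placeholders:

```python
# Rough sketch of prompt KV caching with a local Hugging Face causal LM.
# The model name, prompt, and class completions are placeholders.
import copy

import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

prompt = "This product review is"  # placeholder
completions = [" positive", " negative"]  # placeholder classes

with torch.no_grad():
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    prompt_out = model(prompt_ids, use_cache=True)  # run the prompt once

    for completion in completions:
        # Copy the cache: newer transformers versions mutate it in place.
        past = copy.deepcopy(prompt_out.past_key_values)
        comp_ids = tokenizer(completion, return_tensors="pt").input_ids
        # Feed only the completion tokens; attention reuses the cache.
        comp_out = model(comp_ids, past_key_values=past)
        # The last prompt position predicts the first completion token;
        # each completion position (except the last) predicts the next.
        logits = torch.cat(
            [prompt_out.logits[:, -1:], comp_out.logits[:, :-1]], dim=1
        )
        log_probs = F.log_softmax(logits, dim=-1)
        token_log_probs = log_probs.gather(2, comp_ids.unsqueeze(-1))
        print(completion, token_log_probs.mean().item())
```

The prompt only gets one forward pass no matter how many classes there are; each class then costs a pass over just its own few tokens.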

2

PassingTumbleweed t1_jegonam wrote

Cool! I wonder if you've thought about synonyms. It seems like there might be a lot of cases where classes with more synonyms (or even cases of plurality, e.g., bird vs. birds) are at a disadvantage.

2

KD_A OP t1_jegsqe6 wrote

That's a good criticism. I'd guess that this issue is quite problem-dependent, and I'd hope that an LM is good enough to discriminate between the correct-but-many-synonyms class and the wrong-but-few-synonyms class. (We're using the word "synonym", but we really mean "high-probability token path given the prompt".) It's hard for me to come up with examples where this problem arises in a real classification task, but they may be out there.

2

PassingTumbleweed t1_jegvhb5 wrote

What I was thinking is that some kind of hierarchical LLM taxonomy might be interesting, where you can re-jigger the conditional probability tree onto an arbitrary vocabulary of token sequences.

2

KD_A OP t1_jegxas8 wrote

Interesting, and I think I know what you mean. One naive idea is a "top-k tokens" system: for each completion, consider the k highest-probability tokens (conditional on previous ones) at each completion token position, then sum the average likelihoods across all k^n paths (n = number of completion tokens). That would be one way to address this synonym problem. But of course it results in way more computation.
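A rough sketch of just the path enumeration part (placeholder model and prompt; mapping the enumerated paths back onto classes is the part that'd still need working out):

```python
# Rough sketch of top-k path enumeration (placeholder model and
# prompt; naive recursion, so cost grows as k**depth).
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()


@torch.no_grad()
def topk_paths(input_ids, k: int, depth: int):
    """Enumerate all k**depth continuations of input_ids, returning
    (token path, summed log-probability) pairs."""
    if depth == 0:
        return [([], 0.0)]
    logits = model(input_ids).logits[0, -1]  # next-token logits
    log_probs = F.log_softmax(logits, dim=-1)
    top = torch.topk(log_probs, k)
    paths = []
    for log_prob, token in zip(top.values.tolist(), top.indices.tolist()):
        next_ids = torch.cat([input_ids, torch.tensor([[token]])], dim=1)
        for rest, rest_log_prob in topk_paths(next_ids, k, depth - 1):
            paths.append(([token] + rest, log_prob + rest_log_prob))
    return paths


prompt = "The animal in the photo is a"  # placeholder
prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
for path, log_prob in topk_paths(prompt_ids, k=3, depth=2):
    print(repr(tokenizer.decode(path)), log_prob / 2)  # avg per token
```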

Edit: actually, thinking a bit more, I think the synonym problem is more-or-less a non-issue for LMs trained to do next-token prediction.

2

PassingTumbleweed t1_jeh0p1j wrote

I'm curious to get your thoughts on a simple example where you have three classes: cat, dog, and bird. What happens if the top-1 prediction is "eagle"? Does that probability mass get discarded? Because it should probably go into the bird category.

1

KD_A OP t1_jeh0ygl wrote

Yup, it gets totally discarded. Hopefully, the conditional probability of bird is higher than that of cat or dog.
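To make the discarding concrete, here's a rough sketch (placeholder model and prompt, and it assumes each label happens to be a single token in the vocabulary):

```python
# Rough sketch of the discarded-mass point (placeholder model and
# prompt). Only the class tokens' probabilities are read off the
# next-token distribution; mass on anything else (e.g. " eagle")
# contributes to no class.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

prompt = "It has feathers and a beak. It's a"  # placeholder
with torch.no_grad():
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    next_token_probs = F.softmax(model(input_ids).logits[0, -1], dim=-1)

for word in [" cat", " dog", " bird", " eagle"]:
    # Assumes one token; multi-token labels would need chaining.
    token_id = tokenizer(word).input_ids[0]
    print(word, next_token_probs[token_id].item())
# " eagle"'s mass goes to none of the three classes; the hope is
# that " bird" still comes out on top.
```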

2

PassingTumbleweed t1_jeh1248 wrote

One thing I've seen with these LLMs is that you can prompt them with the classes using sort of a multiple-choice style. It would be interesting to experiment with whether this stabilizes the outputs and reduces the number of out-of-vocabulary predictions you get.
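For example, something like this (wording is purely illustrative):

```python
# Illustrative multiple-choice style prompt (wording is a placeholder):
prompt = """Classify the animal in the text. Answer with exactly one of:
cat, dog, bird.

Text: An eagle soared over the lake.
Answer:"""
```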

2