
Seankala OP t1_j9npctd wrote

Thanks for the detailed answer! My use case is that the company I work at currently uses image-based models for e-commerce purposes, but we want to use text-based models as well. The image-based models already have around 30-50M parameters, so I didn't want to bring in a 100M+ parameter model on top of that. Even 15M seems quite big.

5

Seankala t1_j17r2we wrote

Also, make sure to change your random seed for each run and report the mean and variance of each run's performance on the test set. As a principle, you should always set aside a test set that you never touch except for final performance evaluation.
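
Roughly something like this, as a minimal sketch (train_and_evaluate here is a hypothetical stand-in for your own training/eval loop, and the PyTorch import only matters if that's the framework you're using):

    import random
    import statistics

    import numpy as np
    import torch


    def train_and_evaluate(seed: int) -> float:
        # Hypothetical stand-in: train a model with this seed and return
        # its score on the held-out test set.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        # ... train on the train/val splits, then evaluate exactly once
        # on the untouched test set ...
        return 0.0  # placeholder score


    seeds = [0, 1, 2, 3, 4]
    scores = [train_and_evaluate(s) for s in seeds]

    print(f"mean test score: {statistics.mean(scores):.4f}")
    print(f"variance:        {statistics.variance(scores):.4f}")

The point is just that a single-seed number on the test set tells you very little; the spread across seeds is what makes the comparison meaningful.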

1

Seankala t1_j17qu4r wrote

Frankly, I feel the same way about diffusion models. It's nothing more than "whoa, cool!" for me. After doing NLP research in graduate school and now doing NLP in industry, I'm increasingly feeling a huge disconnect between academic research and the real world.

2

Seankala t1_ivdqt68 wrote

I think the word "misinformation" is a little dangerous to use here. People can criticize the authors for not providing any actual novelty (which is actually super common), but pointing fingers and saying they're "spreading misinformation" is a bit much.

44

Seankala OP t1_iuozuvw wrote

Ah, thanks for the comment, but I don't think we'll need to account for sentiment information. I probably should have said "publicity" rather than "popularity" (the distinction makes more sense in my native language). Negative sentiment would also mean that something is trending, and that's what we're trying to measure, rather than how positively people view it.

2

Seankala t1_itk390z wrote

Try looking up DensePhrases; it was made by a colleague of mine and may be what you're looking for. They also have an online demo you can try.

I'm not sure what you mean by "retrieval-based language model" though. I don't think there's any language model that's made solely for the purpose of retrieval.

1