
Featureless_Bug t1_j9iy4yq wrote

Haven't heard of GLMs being successfully used for NLP and CV recently. And those are pretty much the only things that would be described as large scale in ML. The statement is completely correct - even stuff like gradient boosting does not work at scale in that sense.

26

chief167 t1_j9kt5ho wrote

We use gradient boosting at quite a big scale. Not LLM big, but still big. It's just not NLP or CV at all. It's for fraud detection on large transactional tabular datasets. And it outperforms basically all neural network approaches, shallow or deep.

2
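
For concreteness, a minimal sketch of that kind of setup - gradient boosting on imbalanced, fraud-style tabular data. scikit-learn and the synthetic dataset are assumptions standing in for whatever real transactional data and library the commenter uses:

```python
# Sketch: gradient boosting on imbalanced, fraud-style tabular data.
# Synthetic data stands in for a real transactional dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Rare positive class (~1%) mimics fraud labels.
X, y = make_classification(
    n_samples=100_000, n_features=30, n_informative=10,
    weights=[0.99], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0,
)

# Histogram-based boosting handles large tabular datasets efficiently.
clf = HistGradientBoostingClassifier(max_iter=200, random_state=0)
clf.fit(X_train, y_train)
print("test ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```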

Featureless_Bug t1_j9kuu22 wrote

Large scale is somewhere north of 1-2 TB of data. And even if you had that much tabular data, in the vast majority of cases its structure is so simple that you wouldn't need anywhere near that much to achieve the same performance - so I wouldn't call any kind of tabular data large scale, to be frank.

−2

relevantmeemayhere t1_j9ki2x1 wrote

Because they are useful for some problems and not others, like every algorithm? Nowhere in my statement did I say they are monolithic in their use across all subdomains of ML.

The statement was that deep learning is the only thing that works at scale. It's not lol. Deep learning struggles in a lot of situations.

0

Featureless_Bug t1_j9kvek5 wrote

Ok, name one large-scale problem where GLMs are the best prediction algorithm possible.

1

relevantmeemayhere t1_j9kygtx wrote

Any problem where you want things like effect estimates lol. Or error estimates. Or models that give you joint distributions.

So, literally a ton of them. Which industries don't like things like that?

−2
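
To illustrate the point about effect and error estimates, a minimal hedged sketch of a GLM fit. statsmodels and the logistic example are assumptions for illustration, not anything named in the thread:

```python
# Sketch: a GLM (logistic regression) reporting effect estimates
# with standard errors - statsmodels is an assumed library choice.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 3))
true_beta = np.array([1.0, -2.0, 0.0])
# Bernoulli outcomes from a known logistic data-generating process.
p = 1.0 / (1.0 + np.exp(-(0.5 + X @ true_beta)))
y = rng.binomial(1, p)

# Binomial family with the default logit link = logistic regression as a GLM.
result = sm.GLM(y, sm.add_constant(X), family=sm.families.Binomial()).fit()
# summary() shows coefficients (effect estimates), standard errors,
# and confidence intervals - the inferential outputs being argued for.
print(result.summary())
```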