
Featureless_Bug t1_j9iy4yq wrote

Haven't heard of GLMs being successfully used for NLP and CV recently. And those are pretty much the only things that would be described as large scale in ML. The statement is completely correct - even stuff like gradient boosting does not work at scale in that sense.

26

chief167 t1_j9kt5ho wrote

We use gradient boosting at quite a big scale. Not LLM big, but still big. It's just not NLP or CV at all. It's for fraud detection on large transactional tabular datasets. And it outperforms basically all neural network approaches, shallow or deep.

2
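
For concreteness, a minimal sketch of that kind of setup - gradient boosting on imbalanced, fraud-style tabular data. scikit-learn and the synthetic dataset are assumptions standing in for whatever real transactional data and library the commenter uses:

```python
# Sketch: gradient boosting on imbalanced, fraud-style tabular data.
# Synthetic data stands in for a real transactional dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Rare positive class (~1%) mimics fraud labels.
X, y = make_classification(
    n_samples=100_000, n_features=30, n_informative=10,
    weights=[0.99], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0,
)

# Histogram-based boosting handles large tabular datasets efficiently.
clf = HistGradientBoostingClassifier(max_iter=200, random_state=0)
clf.fit(X_train, y_train)
print("test ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```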

Featureless_Bug t1_j9kuu22 wrote

Large scale is somewhere north of 1-2 TB of data. And even if you had that much tabular data, in the vast majority of cases its structure is so simple that you wouldn't need anywhere near that much to achieve the same performance - so I wouldn't call any kind of tabular data large scale, to be frank.

−2

relevantmeemayhere t1_j9ki2x1 wrote

Because they are useful for some problems and not others, like every algorithm? Nowhere in my statement did I say they are monolithic in their use across all subdomains of ML.

The statement was that deep learning is the only thing that works at scale. It's not lol. Deep learning struggles in a lot of situations.

0

Featureless_Bug t1_j9kvek5 wrote

Ok, name one large-scale problem where GLMs are the best prediction algorithm possible.

1

relevantmeemayhere t1_j9kygtx wrote

Any problem where you want things like effect estimates lol. Or error estimates. Or models that give you joint distributions.

So, literally a ton of them. Which industries don't like things like that?

−2
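
To illustrate the point about effect and error estimates, a minimal hedged sketch of a GLM fit. statsmodels and the logistic example are assumptions for illustration, not anything named in the thread:

```python
# Sketch: a GLM (logistic regression) reporting effect estimates
# with standard errors - statsmodels is an assumed library choice.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 3))
true_beta = np.array([1.0, -2.0, 0.0])
# Bernoulli outcomes from a known logistic data-generating process.
p = 1.0 / (1.0 + np.exp(-(0.5 + X @ true_beta)))
y = rng.binomial(1, p)

# Binomial family with the default logit link = logistic regression as a GLM.
result = sm.GLM(y, sm.add_constant(X), family=sm.families.Binomial()).fit()
# summary() shows coefficients (effect estimates), standard errors,
# and confidence intervals - the inferential outputs being argued for.
print(result.summary())
```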