Submitted by GraciousReformer t3_118pof6 in MachineLearning
GraciousReformer OP t1_j9j7iwm wrote
Reply to comment by VirtualHat in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
>Large linear models tend not to scale well to large datasets if the solution is not in the model class
Will you provide me a reference?
VirtualHat t1_j9j8805 wrote
Linear models make an assumption that the solution is in the form of y=ax+b. If the solution is not in this form, then the best solution is likely to be a poor solution.
I think Emma Brunskill's notes are quite good at explaining this. Essentially the model will underfit as it is too simple. I am making an assumption though, that a large dataset implies a more complex non-linear solution, but this is generally the case.
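A minimal sketch of the underfitting point above, assuming a toy target y = x^2 on [-1, 1] (the target, the helper name `fit_mse`, and the sample sizes are illustrative choices, not from the thread). It fits a straight line and a quadratic by least squares and compares the training error as the dataset grows:

```python
import numpy as np

def fit_mse(n, degree, seed=0):
    """Least-squares polynomial fit of the given degree; returns training MSE."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    y = x ** 2  # assumed ground truth: not of the form y = a*x + b
    coeffs = np.polyfit(x, y, degree)      # degree=1 is the linear model class
    residuals = np.polyval(coeffs, x) - y
    return float(np.mean(residuals ** 2))

for n in (100, 10_000, 100_000):
    # The linear error does not shrink toward zero as n grows: for this
    # target it tends to Var(x^2) = 1/5 - 1/9 = 4/45. The quadratic model
    # contains the true solution, so its error is numerically zero.
    print(n, fit_mse(n, degree=1), fit_mse(n, degree=2))
```

More data does not help the linear fit here, because the limiting error comes from the model class rather than from the sample size.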
relevantmeemayhere t1_j9kifhu wrote
Linear models are often preferred for the reasons you mentioned. Underfitting is almost always preferred to overfitting.
VirtualHat t1_j9ll5i2 wrote
Yes, that's right. For many problems, a linear model is just what you want. I guess what I'm saying is that the dividing line between when a linear model is appropriate vs when you want a more expressive model is often related to how much data you have.
GraciousReformer OP t1_j9j8bsl wrote
Thank you. I understand the math. But I meant a real-world example where "the solution is not in the model class."
VirtualHat t1_j9j8uvr wrote
For example, in the Iris dataset, the class label is not a linear combination of the inputs. Therefore, if your model class is all linear models, you won't find the optimal or, in this case, even a good solution.
If you extend the model class to include non-linear functions, then your hypothesis space now at least contains a good solution, but finding it might be a bit more tricky.
GraciousReformer OP t1_j9jgdmc wrote
But DL is not a linear model. Then what will be the limit of DL?
terminal_object t1_j9jp51j wrote
You seem confused as to what you yourself are saying.
GraciousReformer OP t1_j9jppu7 wrote
"Artificial neural networks are often (demeaningly) called "glorified regressions". The main difference between ANNs and multiple / multivariate linear regression is, of course, that the ANN models nonlinear relationships."
PHEEEEELLLLLEEEEP t1_j9k691x wrote
Regression doesn't just mean linear regression, if that's what you're confused about.
Acrobatic-Book t1_j9k94l4 wrote
The simplest example is the XOR problem (aka exclusive or). This is also why multilayer perceptrons, the basis of deep learning, were actually created: a linear model cannot solve it.
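A small sketch of the XOR point, under stated assumptions: the linear model is fit by ordinary least squares with a bias term, and the two-unit ReLU network uses hand-picked weights chosen for illustration (they are not trained, and not from the thread):

```python
import numpy as np

# The four XOR inputs and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# Best linear fit y ~ w1*x1 + w2*x2 + b in the least-squares sense.
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_pred = A @ coef  # every input is mapped to 0.5

def relu(z):
    return np.maximum(z, 0.0)

def tiny_mlp(inputs):
    """One hidden layer with two ReLU units and hand-set weights."""
    s = inputs[:, 0] + inputs[:, 1]
    h1 = relu(s)        # counts how many inputs are on
    h2 = relu(s - 1.0)  # fires only when both inputs are on
    return h1 - 2.0 * h2

mlp_pred = tiny_mlp(X)
print("linear:", linear_pred)
print("mlp:   ", mlp_pred)
```

The least-squares line cannot do better than predicting 0.5 everywhere, while a single hidden layer is enough to represent XOR exactly.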
VirtualHat t1_j9lkto4 wrote
Oh wow, super weird to be downvoted just for asking for a reference. r/MachineLearning isn't what it used to be I guess, sorry about that.