Submitted by GraciousReformer t3_118pof6 in MachineLearning
VirtualHat t1_j9j2gwx wrote
Reply to comment by relevantmeemayhere in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
Large linear models tend not to scale well to large datasets if the solution is not in the model class. Because of this lack of expressivity, linear models tend to do poorly on complex problems.
relevantmeemayhere t1_j9khp8m wrote
As you mentioned, this is highly dependent on the functional relationship of the data.
You’d want to use domain knowledge to determine that.
Additionally, non-linear models tend to have their own drawbacks, lack of interpretability and high variance being some of them.
GraciousReformer OP t1_j9j7iwm wrote
>Large linear models tend not to scale well to large datasets if the solution is not in the model class
Will you provide me a reference?
VirtualHat t1_j9j8805 wrote
Linear models assume the solution has the form y = ax + b. If the true solution is not of this form, then the best solution in the model class is still likely to be a poor one.
I think Emma Brunskill's notes are quite good at explaining this. Essentially the model will underfit, as it is too simple. I am making an assumption, though, that a large dataset implies a more complex non-linear solution, but this is generally the case.
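A quick sketch of this underfitting point (my own toy example, not from the notes): fit polynomials of increasing degree to data generated by y = sin(x) over one period. The best straight line leaves a large residual error no matter how much data you have, while a richer model class (here, a cubic) fits far better.

```python
# Toy underfitting demo: a linear fit to sin(x) is stuck with high
# error because no straight line is in the right model class.
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 200)
y = np.sin(x)

def poly_fit_mse(degree):
    # Least-squares polynomial fit, then mean squared residual on the data.
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.mean(residuals ** 2))

linear_mse = poly_fit_mse(1)  # straight line: underfits (MSE around 0.2)
cubic_mse = poly_fit_mse(3)   # cubic: much smaller error
print(linear_mse, cubic_mse)
```

More data points would not help the linear fit here; only enlarging the model class does.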
relevantmeemayhere t1_j9kifhu wrote
Linear models are often preferred for the reasons you mentioned. Underfitting is almost always preferred to overfitting.
VirtualHat t1_j9ll5i2 wrote
Yes, that's right. For many problems, a linear model is just what you want. I guess what I'm saying is that the dividing line between when a linear model is appropriate vs when you want a more expressive model is often related to how much data you have.
GraciousReformer OP t1_j9j8bsl wrote
Thank you. I understand the math. But I meant a real world example that "the solution is not in the model class."
VirtualHat t1_j9j8uvr wrote
For example, in the Iris dataset, the class label is not a linear combination of the inputs. Therefore, if your model class is all linear models, you won't find the optimal, or in this case even a good, solution.
If you extend the model class to include non-linear functions, then your hypothesis space now at least contains a good solution, but finding it might be a bit more tricky.
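If scikit-learn is handy, here's a rough way to see the hypothesis-space point on the built-in Iris data (my own illustration, with the caveat that Iris is small and mostly linearly separable): a linear classifier can't match all the training labels, since versicolor and virginica overlap, while a more expressive model class can.

```python
# Compare training fit of a linear model class vs a more expressive one
# on the Iris dataset.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Logistic regression: linear decision boundaries.
linear = LogisticRegression(max_iter=1000).fit(X, y)
# Unrestricted decision tree: can carve out non-linear regions.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)

print("linear training accuracy:", linear.score(X, y))  # below 1.0
print("tree training accuracy:", tree.score(X, y))
```

Of course, fitting the training set perfectly isn't the goal in practice, which is where the underfitting-vs-overfitting trade-off above comes back in.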
GraciousReformer OP t1_j9jgdmc wrote
But DL is not a linear model. What, then, is the limit of DL?
terminal_object t1_j9jp51j wrote
You seem confused as to what you yourself are saying.
GraciousReformer OP t1_j9jppu7 wrote
"Artificial neural networks are often (demeneangly) called "glorified regressions". The main difference between ANNs and multiple / multivariate linear regression is of course, that the ANN models nonlinear relationships."
PHEEEEELLLLLEEEEP t1_j9k691x wrote
Regression doesn't just mean linear regression, if that's what you're confused about.
Acrobatic-Book t1_j9k94l4 wrote
The simplest example is the XOR problem (aka exclusive or). This is also why multilayer perceptrons, the basis of deep learning, were actually created: because a linear model cannot solve it.
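A quick numpy-only sketch of this (weights hand-picked for illustration): the best least-squares linear fit to XOR collapses to predicting 0.5 for every input, while a tiny hand-wired MLP with one ReLU hidden layer computes XOR exactly.

```python
# XOR: no linear model fits it, but a two-unit hidden layer does.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])  # XOR labels

# Best linear fit y ~ a*x1 + b*x2 + c: least squares gives a = b = 0,
# c = 0.5, i.e. it predicts 0.5 everywhere and is wrong on every input.
A = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_pred = A @ w

# Hand-wired MLP: h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1),
# output = h1 - 2*h2 reproduces XOR exactly.
def tiny_mlp(x):
    h1 = max(0.0, x[0] + x[1])
    h2 = max(0.0, x[0] + x[1] - 1.0)
    return h1 - 2.0 * h2

mlp_pred = [tiny_mlp(x) for x in X]
print(linear_pred, mlp_pred)
```

The hidden layer is what buys the extra expressivity: it bends the input space so that the four XOR points become linearly separable.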
VirtualHat t1_j9lkto4 wrote
Oh wow, super weird to be downvoted just for asking for a reference. r/MachineLearning isn't what it used to be I guess, sorry about that.