Submitted by fujidaiti t3_10pu9eh in MachineLearning
Internal-Diet-514 t1_j6nep37 wrote
Deep learning is only really the better option with higher-dimensional data. If tabular data is 2D, time series is 3D, and image data is 4D (extra dimension for batch), then deep learning is really only used for 3D and 4D data. As others have said, tree-based models will most of the time outperform deep learning on a 2D problem.
But I think the interesting thing is the reason we have to use deep learning in the first place. In higher-dimensional data we don't have something that is "a feature" in the sense that we do with 2D data. In time series you have features, but they are taken over time, so really we need a feature which describes that feature over time. That's what CNNs do. CNNs are feature extractors, and at the end of the process they almost always put the data back into 2D format (when doing classification), which is then sent through a neural net, but it could be sent through a random forest as well.
I think it's fair to compare a neural network to traditional ML, but when we get into a CNN that's not really a comparison. A CNN is a feature extraction method. The great thing is that we can optimize this step by connecting it to a neural network with a sigmoid (or whatever activation) output.
We don't have a way to connect traditional ML methods with a feature extraction method in the way we can with backpropagation for a neural net and a CNN. If it's possible to find a way to do that, maybe we would see a rise in the use of traditional ML for high-dimensional data.
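To make the "send it through a random forest" part concrete, here's a minimal sketch of the idea, assuming a torchvision ResNet backbone and scikit-learn; the shapes and model choices are arbitrary illustrations, not a recipe:

```python
import torch
import torch.nn as nn
from torchvision import models
from sklearn.ensemble import RandomForestClassifier

# CNN used purely as a feature extractor: drop the classification head so the
# output is a (batch, features) matrix, i.e. back to "2D" tabular form.
backbone = models.resnet18(weights=None)   # in practice you'd load pretrained weights
backbone.fc = nn.Identity()                # keep the 512-d pooled features
backbone.eval()

images = torch.randn(32, 3, 224, 224)      # stand-in batch of images
labels = torch.randint(0, 2, (32,))        # stand-in binary labels

with torch.no_grad():
    feats = backbone(images)               # shape (32, 512)

# The extracted features can be handed to a traditional model instead of an MLP head.
clf = RandomForestClassifier(n_estimators=100).fit(feats.numpy(), labels.numpy())
```

Note the extractor stays frozen here, which is exactly the limitation above: the forest can't send gradients back into the conv layers, so the two stages aren't optimized together.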
JimmyTheCrossEyedDog t1_j6nv3zg wrote
This feels like a mix-up between the colloquial and mathematical definitions of dimension. Yes, NN approaches tend to work better on very high-dimensional data, but the dimension here refers to the number of input features. So, for a 416x416x3 image, that's >500k dimensions, far higher than the number of dimensions in almost all tabular datasets.
> image data 4D (extra dimension for batch)
The batch is an arbitrary parceling of the data, done simply because of how NNs are typically trained for computational reasons. If I were to train an NN on tabular data, it'd also be batched, but that doesn't give it a new meaningful dimension (either in the colloquial sense or the sense that matters for ML).
Also, NNs are still the best option for computer vision even on greyscale data, which is spatially 2D but still has a huge number of dimensions.
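To put numbers on that, a throwaway sketch (image sizes picked arbitrarily):

```python
import numpy as np

image = np.zeros((416, 416, 3))        # one RGB image
print(image.size)                      # 519168 input features, i.e. "dimensions"

gray = np.zeros((416, 416))            # one greyscale image, spatially 2D
print(gray.size)                       # still 173056 input features

batch = np.zeros((64, 416, 416, 3))    # batching adds an array axis,
print(batch.ndim, image.size)          # but each sample still has 519168 features
```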
edit: I'd also argue that high dimensionality isn't the biggest reason NNs work for computer vision, but something more fundamental - see qalis's point in this thread
Internal-Diet-514 t1_j6nzvcc wrote
When talking about dimensions I meant (number of rows, number of features) is 2 dimensions for tabular data. (Number of series, number of time steps, number of features) is 3 dimensions for time series, and (number of images, width, height, channels) is 4 dimensions for image data. For deep learning classification, regardless of the number of dimensions it originally ingests, the data will become (number of series, features) or (number of images, features) by the time we get to the point of applying an MLP for classification.
You could consider an image to have width x height x channels features, but that's not what a CNN does; the CNN extracts meaningful features from the high-dimensional space. The feature extraction phase is what makes deep learning great for computer vision. Traditional ML models don't have that phase.
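For concreteness, here's the shape convention I mean and the point where a CNN hands things back to an MLP; all the sizes below are made up:

```python
import torch
import torch.nn as nn

tabular = torch.randn(100, 20)         # (rows, features)               -> "2D"
series  = torch.randn(100, 50, 8)      # (series, time steps, features) -> "3D"
images  = torch.randn(100, 3, 64, 64)  # (images, channels, h, w)       -> "4D"

# A small conv stack collapses the image batch back to (images, features)
# before any MLP / classification head is applied.
extractor = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # global average pooling over the spatial axes
    nn.Flatten(),
)
print(extractor(images).shape)  # torch.Size([100, 16])
```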
JimmyTheCrossEyedDog t1_j6odc4c wrote
> When talking about dimensions I meant (number of rows, number of features) is 2 dimensions for tabular data...
Right, but my point is that when people say "NNs work well on high dimensional data", that's not what they mean.
> You could consider an image to have width x height x channels features
It does have that many input features, i.e. dimensions, like you've written below.
> but that's not what a CNN does; the CNN extracts meaningful features from the high-dimensional space.
Now we're talking about composite or higher-level features, which is different from what we've been talking about up to this point. It's true that for tabular data (or old-school, pre-NN computer vision) you generally have to construct these yourself, whereas with images you can just throw the raw data in and the NN does this more effectively than you ever could, but this is irrelevant to the input dimensionality.
Internal-Diet-514 t1_j6oi9qu wrote
If we're considering the dimensions to be the number of datapoints in an input, then I'll stick to that definition and use the shape of the data instead of dimensions. I don't think I was wrong to use dimensions to describe the shape of the data, but I get that it could be confusing, because high-dimensional data is synonymous with a large number of features, whereas I meant high dimensions to be data with shape > 2.
Deep learning, or CNNs specifically, are great because of their ability to extract meaningful features from data with shape > 2 and then pass that representation to an MLP. But the feature extraction phase is a different task from what traditional ML is meant to do, which is to take a set of already-derived features and learn a decision boundary. So I'm trying to say a traditional ML model is not super comparable to the convolutional portion (the feature extraction phase) of a CNN.
JimmyTheCrossEyedDog t1_j6osqa5 wrote
Good call, shape is the much better term to avoid confusion.
> If we’re considering the dimensions to be the number of datapoints
To clarify - not the number of datapoints, the number of input features. The number of datapoints has nothing to do with the dimensionality (only the shape).
> Deep learning, or CNNs specifically, are great because of their ability to extract meaningful features from data with shape > 2
This is where I'd disagree (but maybe you have a source that suggests otherwise). Even for time series tabular data, gradient boosted tree models typically outperform NNs.
Overall, shape rarely has anything to do with how a model performs. CNNs are built to take knowledge of the shape of the data into account (restricting kernels to convolutions of spatially close datapoints), but not all NNs do that. If we were using a network with only fully connected layers, for example, then there is no notion of spatial closeness - we might as well have transformed an NxN image into an N^2 x 1 vector, and the network would be the same.
So, the fact that neural networks handle inputs with spatial (or temporal) relationships well has nothing to do with their being neural networks; it comes from the assumptions we've baked into the architecture (like convolutional layers).
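A toy illustration of that point (layer sizes are arbitrary):

```python
import torch
import torch.nn as nn

x = torch.randn(8, 1, 28, 28)  # batch of greyscale images

# Fully connected network: the image is just a 784-dim vector; no notion of
# which pixels were spatially adjacent survives the Flatten.
mlp = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))

# Convolutional network: each 3x3 kernel only connects spatially close pixels,
# which is the assumption baked into the architecture.
cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)

print(mlp(x).shape, cnn(x).shape)  # both (8, 10); only the CNN uses the spatial layout
```

Applying a fixed shuffle to the pixel order would leave the MLP's family of functions unchanged, but it would break the locality assumption the CNN relies on.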
Internal-Diet-514 t1_j6oujtg wrote
Time series tabular data would have shape 3: (number of series, number of time points, number of features). For gradient boosted tree models, isn't the general approach to flatten the space to (number of series, number of time points x number of features)? Whereas a CNN would be employed to extract time-dependent features before flattening the space.
If there are examples where boosted tree models perform better in this space, and I think you're right that there are, then I think that just goes to show that traditional machine learning isn't dead; rather, if we could find ways to combine it with the thing that makes deep learning work so well (feature extraction), it'd probably do even better.
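A rough sketch of the two routes (flatten-for-trees vs. CNN-features-then-trees), with made-up shapes and models:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import GradientBoostingClassifier

series = np.random.randn(200, 50, 4).astype(np.float32)  # (series, time points, features)
labels = np.random.randint(0, 2, 200)

# Route 1: flatten to (series, time points * features) and fit a boosted tree model.
flat = series.reshape(200, -1)                            # (200, 200)
trees_on_raw = GradientBoostingClassifier().fit(flat, labels)

# Route 2: let a small 1D CNN extract time-dependent features first...
extractor = nn.Sequential(
    nn.Conv1d(4, 16, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),                # -> (200, 16)
)
with torch.no_grad():
    feats = extractor(torch.from_numpy(series).permute(0, 2, 1))  # Conv1d wants (N, C, L)

# ...and hand those features to the same kind of tree model.
trees_on_cnn_feats = GradientBoostingClassifier().fit(feats.numpy(), labels)
```

The second route still has the limitation from earlier in the thread: without a differentiable head, the extractor isn't being optimized for the tree model.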
qalis t1_j6o79xv wrote
A better distinction would be that deep learning excels in applications that require representation learning, i.e. transformation from domains that do not lie in a Euclidean metric space (e.g. graphs) or that are too problematic in their raw form and require processing in another domain (e.g. images, audio). This is very similar to feature extraction, but representation learning is a somewhat more general term.
Tabular ML does not need this in general, since after obtaining feature vectors we already have a representation, and a deep learning model like an MLP can only apply an (exponentially) nonlinear transformation of that space, instead of really learning fundamentally new representations of the data. That is the case, e.g., for images, where we go from the raw pixel value space to a vector space that captures the semantic features of the image.