
magical_mykhaylo t1_j2sj2fc wrote

This is a very general issue, often called "the curse of dimensionality" or the "short and wide" problem (many more features than samples). There are a number of ways to deal with it, most of which fall under the umbrella term of "dimensionality reduction". It's really tricky not to over-fit these types of models, but here are some things you can try:

You can reduce the number of features using Principal Component Analysis (PCA), Independent Component Analysis (ICA), or UMAP. With PCA or ICA, broadly speaking, you are not training your model on the individual variables themselves, but rather on linear combinations of those variables, used as "latent variables".
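A minimal sketch of that idea, assuming scikit-learn and a toy "short and wide" matrix: project the features onto a few principal components, then fit a simple classifier on those latent variables.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))      # 40 samples, 500 features ("short and wide")
y = rng.integers(0, 2, size=40)     # toy binary labels

# Keep far fewer components than samples to limit over-fitting.
model = make_pipeline(PCA(n_components=5), LogisticRegression(max_iter=1000))
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```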

You can select the most relevant features, using feature (variable) selection prior to training your algorithm. In the context of Random Forests this can be done using Gini importance (mean decrease in impurity) or any number of other similar metrics.
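A sketch of that, again assuming scikit-learn and the same toy data: rank features by the forest's impurity-based (Gini) importances and keep only the top-ranked ones before fitting the final model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))
y = rng.integers(0, 2, size=40)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# threshold=-np.inf with max_features keeps exactly the 10 most important features
selector = SelectFromModel(forest, prefit=True, max_features=10, threshold=-np.inf)
X_selected = selector.transform(X)
print(X_selected.shape)   # (40, 10)
```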

If you are training a linear model, such as Linear Discriminant Analysis (LDA), there are generally high-dimensional variants that incorporate elastic net regularisation to better handle problems with dimensionality. Look up "sparse regression" for more information. Some of these approaches also use Partial Least Squares (PLS) as a way around the problem, though PLS has fallen out of fashion in most fields.
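As one concrete illustration (scikit-learn assumed, not the only way to do it): a logistic model with an elastic-net penalty, which shrinks most coefficients towards zero and so copes with having many more features than samples.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))
y = rng.integers(0, 2, size=40)

# Elastic-net penalty requires the "saga" solver; l1_ratio mixes L1 and L2 terms.
model = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, C=0.1, max_iter=5000
)
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```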

If you are building a neural network (generally a bad idea when you have few samples), you might consider applying regularisation to the hidden layers, e.g. L1/L2 weight penalties or dropout.
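A purely illustrative sketch of that last point, assuming PyTorch: a tiny network where weight decay (an L2 penalty on the weights) and dropout provide the regularisation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(40, 500)                 # 40 samples, 500 features
y = torch.randint(0, 2, (40,)).float()   # toy binary labels

model = nn.Sequential(
    nn.Linear(500, 16), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(16, 1),
)
# weight_decay adds an L2 penalty on all weights during optimisation
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(100):
    optimiser.zero_grad()
    loss = loss_fn(model(X).squeeze(1), y)
    loss.backward()
    optimiser.step()
print("final training loss:", loss.item())
```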
