DCBAtrader

DCBAtrader t1_j68h2fr wrote

Basic question on regression/AutoML (pycaret mainly).

When do p-values matter, and when do error metrics (MAE, MSE, R-squared) matter?

My previous model-building experience (multivariate regression) was to first try various combinations of variables in OLS until all the variables in the model were statistically significant, and then use AutoML (pycaret) to build models on those variables and judge them by MAE, MSE, or R-squared, using proper cross-validation and train/test splits of course.

I'm wondering if the OLS significance step is needed, or if I can just run the entire dataset through pycaret and judge a model purely on those metrics (MAE, MSE, R-squared)?

My gut says that the simpler model with only statistically significant variables should perform better, but maybe I can just pick whichever model has the best error metric?
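To make the comparison concrete, here is a minimal sketch of the two approaches on synthetic data. It uses scikit-learn and SciPy with plain linear regression rather than pycaret, and everything in it is an assumption for illustration: the helper names `ols_pvalues` and `cv_mae`, the 0.05 significance cutoff, and the made-up dataset. It is not a claim about what pycaret does internally.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold


def ols_pvalues(X, y):
    """Two-sided p-values for the OLS slope coefficients (fit includes an intercept)."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])       # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - (p + 1)
    sigma2 = resid @ resid / dof               # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)      # coefficient covariance matrix
    t_stats = beta / np.sqrt(np.diag(cov))
    pvals = 2 * stats.t.sf(np.abs(t_stats), dof)
    return pvals[1:]                           # drop the intercept's p-value


def cv_mae(X, y, select_significant, alpha=0.05, n_splits=5, seed=0):
    """Mean cross-validated MAE of a linear model.

    If select_significant is True, features are filtered by p-value using
    only the training fold, so the held-out fold never influences selection.
    """
    maes = []
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in kf.split(X):
        X_tr, X_te = X[train_idx], X[test_idx]
        y_tr, y_te = y[train_idx], y[test_idx]
        cols = np.arange(X.shape[1])
        if select_significant:
            keep = ols_pvalues(X_tr, y_tr) < alpha
            if keep.any():                     # fall back to all features if none pass
                cols = cols[keep]
        model = LinearRegression().fit(X_tr[:, cols], y_tr)
        maes.append(mean_absolute_error(y_te, model.predict(X_te[:, cols])))
    return float(np.mean(maes))


# Synthetic data (hypothetical): 3 informative features, 7 pure-noise features.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.5 * X[:, 2] + rng.normal(size=200)

mae_all = cv_mae(X, y, select_significant=False)
mae_sig = cv_mae(X, y, select_significant=True)
print(f"CV MAE, all features:         {mae_all:.3f}")
print(f"CV MAE, significant features: {mae_sig:.3f}")
```

The p-value filter is applied inside each training fold on purpose: selecting variables on the full dataset first and cross-validating afterwards would let the held-out data influence the selection, which makes the reported error look better than it really is.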
