
le_bebop t1_it4npm4 wrote

Question: Any advice on probabilistic regression with small data (~500 instances, 14 features)?
I'm using xgboost and trying to avoid overfitting by tuning hyperparameters with hyperopt to minimize the average validation error over 5-fold CV, but the model still overfits (average CV train MAPE 2.85; average CV validation MAPE 15.36; held-out test MAPE 18).
I've read that Bayesian models are recommended for regression on small data like this, but I'm not familiar with them (yet). Could you give any tips for achieving robust generalization in small-data regression, or recommend a Bayesian library I could try?
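
To make the overfitting gap in the question concrete, here is a minimal sketch of how average train MAPE and average held-out-fold MAPE can be compared under 5-fold CV. Everything in it is an assumption made for illustration: the data is synthetic (500 rows, 14 features), the helper names `mape`, `kfold_indices`, `predict_1nn` and `cv_train_val_mape` are hypothetical, and a 1-nearest-neighbour regressor stands in for xgboost so the example runs with the standard library alone. It is not the poster's pipeline.

```python
import math
import random


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def kfold_indices(n, k, seed=0):
    """Shuffle row indices and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(X_train, y_train, x):
    """Stand-in model: return the target of the nearest training row."""
    nearest = min(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    return y_train[nearest]


def cv_train_val_mape(X, y, k=5, seed=0):
    """Return (average train MAPE, average held-out-fold MAPE) over k folds."""
    train_scores, val_scores = [], []
    for fold in kfold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        # Score the same fitted model on the rows it saw and the rows it did not.
        train_pred = [predict_1nn(X_tr, y_tr, X[i]) for i in train]
        val_pred = [predict_1nn(X_tr, y_tr, X[i]) for i in fold]
        train_scores.append(mape(y_tr, train_pred))
        val_scores.append(mape([y[i] for i in fold], val_pred))
    return sum(train_scores) / k, sum(val_scores) / k


# Synthetic data (an assumption): only the first 3 of 14 features carry signal.
rng = random.Random(42)
X = [[rng.uniform(0.0, 1.0) for _ in range(14)] for _ in range(500)]
y = [10.0 + sum(row[:3]) + rng.gauss(0.0, 0.5) for row in X]

train_mape, val_mape = cv_train_val_mape(X, y)
print(f"avg train MAPE: {train_mape:.2f}  avg held-out MAPE: {val_mape:.2f}")
```

The stand-in memorizes its training rows, so its train MAPE is zero while its held-out MAPE is not. That is the same kind of gap as in the numbers above, only more extreme. Swapping `predict_1nn` for a real model keeps the comparison itself unchanged.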
