le_bebop t1_it4npm4 wrote on October 20, 2022 at 10:16 PM

Question: Any advice on probabilistic regression with small data (~500 instances, 14 features)?
I'm using xgboost and trying to avoid overfitting by tuning hyperparameters with hyperopt to minimize the average validation error over 5-fold CV, but the model still overfits (average CV train MAPE 2.85; average CV test MAPE 15.36; test MAPE 18).
I've read that Bayesian models are recommended for regression on small data like this, but I'm not familiar with them (yet). Could you give any tips or advice for achieving robust generalization with small-data regression? Or recommend a Bayesian library I could try?
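
For anyone who wants to reproduce the kind of train-versus-validation MAPE gap described above, here is a minimal sketch using only the Python standard library. It is not the xgboost/hyperopt pipeline from the post: the `fit` callable, the `fit_memorizer` toy model, and the synthetic data are all assumptions made for illustration, and the toy model is deliberately built to overfit.

```python
import random
from statistics import mean


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no true value is zero."""
    return 100.0 * mean(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred))


def kfold_indices(n, k=5, seed=0):
    """Shuffle range(n) and split it into k nearly equal validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_train_val_mape(fit, X, y, k=5, seed=0):
    """Return (average train MAPE, average validation MAPE) over k folds.

    `fit(X_train, y_train)` is a hypothetical callable that returns a
    prediction function; any model wrapper could be swapped in here.
    """
    train_scores, val_scores = [], []
    for fold in kfold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        train_scores.append(mape([y[i] for i in train], [predict(X[i]) for i in train]))
        val_scores.append(mape([y[i] for i in fold], [predict(X[i]) for i in fold]))
    return mean(train_scores), mean(val_scores)


def fit_memorizer(X_train, y_train):
    """Toy overfitting model: memorizes training targets, else predicts the train mean."""
    table = {tuple(x): t for x, t in zip(X_train, y_train)}
    fallback = mean(y_train)
    return lambda x: table.get(tuple(x), fallback)


# Synthetic data for illustration only (not the dataset from the post).
rng = random.Random(42)
X = [[rng.uniform(1, 10)] for _ in range(50)]
y = [3.0 * x[0] + rng.uniform(-1, 1) + 5.0 for x in X]

train_mape, val_mape = cv_train_val_mape(fit_memorizer, X, y)
print(f"train MAPE {train_mape:.2f}, validation MAPE {val_mape:.2f}")
```

The memorizing model scores a train MAPE of zero while its validation MAPE is much larger, which is the same symptom as the 2.85 versus 15.36 gap in the post. Comparing the two averages per candidate configuration, rather than looking at the validation score alone, is one way to see how much of the fit is memorization.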