Submitted by fedegarzar t3_z9vbw7 in MachineLearning
SherbertTiny2366 t1_iyj00d2 wrote
Reply to comment by cristianic18 in [R] Statistical vs Deep Learning forecasting methods by fedegarzar
>This ensemble is formed by averaging four statistical models: AutoARIMA, ETS, CES and DynamicOptimizedTheta. This combination won sixth place and was the simplest ensemble among the top 10 performers in the M4 competition.
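For context, here's a minimal sketch of such an average ensemble, assuming the API of Nixtla's statsforecast library (the toy data and settings below are illustrative, not taken from the benchmark):

```python
import numpy as np
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, AutoETS, AutoCES, DynamicOptimizedTheta

# Toy monthly series in statsforecast's long format: unique_id, ds, y
df = pd.DataFrame({
    "unique_id": "series_1",
    "ds": pd.date_range("2015-01-01", periods=60, freq="MS"),
    "y": np.sin(np.arange(60) / 6) * 10 + np.arange(60) * 0.5 + 50,
})

# The four statistical models named above
models = [AutoARIMA(), AutoETS(), AutoCES(), DynamicOptimizedTheta()]
sf = StatsForecast(models=models, freq="MS", n_jobs=-1)

fcst = sf.forecast(df=df, h=12)  # one forecast column per model
model_cols = [c for c in fcst.columns if c not in ("unique_id", "ds")]
fcst["Ensemble"] = fcst[model_cols].mean(axis=1)  # plain unweighted average
print(fcst.head())
```

A plain unweighted mean has no tunable combination weights, which is presumably what made this the simplest ensemble among the top 10 M4 performers.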
TheBrain85 t1_iymw874 wrote
Pretty biased selection method: the best ensemble from the M4 competition, evaluated on the M3 competition. Although I'm not familiar with these datasets, they're from the same author, so presumably they have significant overlap and similarity. The real question is how hard it is to find such an ensemble without overfitting to the dataset.
SherbertTiny2366 t1_iynjxon wrote
How is it biased to try well-performing ensembles on another data set?
And how is that overfitting?
Furthermore, just because the data sets begin with "M" does not mean that they "have significant overlap and similarity."
TheBrain85 t1_iyp1qrz wrote
Because if there's overlap between the datasets, or they contain similar data, the exact ensemble you use is essentially an optimized hyperparameter specific to the dataset. This is exactly why hyperparameter optimization uses cross-validation on a set separate from the test set. So reusing the results from the M4 dataset is akin to optimizing hyperparameters on the test set, which is a form of overfitting.
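To make the objection concrete, here's a hedged sketch (my own illustration in scikit-learn, not from the thread) contrasting hyperparameter selection by cross-validation on the training data with selection by test-set score:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split, cross_val_score

X, y = make_regression(n_samples=500, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

alphas = [0.01, 0.1, 1.0, 10.0, 100.0]

# Sound practice: pick the hyperparameter by cross-validation on training data only.
cv_scores = {a: cross_val_score(Ridge(alpha=a), X_train, y_train, cv=5).mean()
             for a in alphas}
best_alpha = max(cv_scores, key=cv_scores.get)

# The objected-to practice: pick whichever configuration scores best on the test set.
# Its test score is then an optimistic estimate, since the test set drove the choice --
# analogous to carrying over the configuration that won on the evaluation data.
test_scores = {a: Ridge(alpha=a).fit(X_train, y_train).score(X_test, y_test)
               for a in alphas}

# The held-out test set gives an honest estimate only for the CV-chosen model.
final_score = Ridge(alpha=best_alpha).fit(X_train, y_train).score(X_test, y_test)
print(f"alpha={best_alpha}, test R^2={final_score:.3f}")
```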
The datasets are from the same author, same series of competitions: https://en.wikipedia.org/wiki/Makridakis_Competitions#Fourth_competition,_started_on_January_1,_2018,_ended_on_May_31,_2018
"The M4 extended and replicated the results of the previous three competitions"
WikiSummarizerBot t1_iyp1s69 wrote
Makridakis Competitions
Fourth competition, started on January 1, 2018, ended on May 31, 2018
>The fourth competition, M4, was announced in November 2017. The competition started on January 1, 2018 and ended on May 31, 2018. Initial results were published in the International Journal of Forecasting on June 21, 2018. The M4 extended and replicated the results of the previous three competitions, using an extended and diverse set of time series to identify the most accurate forecasting method(s) for different types of predictions.
SherbertTiny2366 t1_iyq6prw wrote
There is no overlap at all. It’s a completely new dataset. There might be similarities in the sense that both contain time series of certain frequencies, but in no way could one speak of “training on the test set.”