Submitted by fedegarzar t3_zk6h8q in MachineLearning
Zealousideal-Card637 t1_izy2kh1 wrote
Interesting comparison. I looked at the full experiments, and Amazon performs slightly better at the bottom level, i.e., the actual time series you are forecasting.
SherbertTiny2366 t1_izy50ew wrote
For hierarchical and sparse data, it is quite common to see models achieve good accuracy at the bottom levels while being very inaccurate at higher aggregation levels. This happens because the models are systematically under- or over-predicting, and those biases add up instead of canceling out when the series are aggregated.
mangotheblackcat89 t1_izzighp wrote
IMO, this is an important consideration. Sure, the target level is SKU-store, but at what level are the purchase orders being made? The M5 competition didn't say anything about this, but the SKU level is probably as important as the SKU-store level, if not more so.
For retail data in general, I think we need to see how well a method performs at different levels of the hierarchy. I've seen commercial and finance teams prefer a forecast that is more accurate at the top over one that is slightly more accurate at the bottom.
-Rizhiy- t1_j018jx5 wrote
Do you by any chance have a resource that explains that a bit more?
I can't get my head around how a collection of accurate forecasts can produce an inaccurate aggregate.
Is it related to class imbalances or perhaps something like Simpson's paradox?
SherbertTiny2366 t1_j01t4du wrote
Imagine this toy example. You have 5 series that are very sparse, as is often the case in retail. Say series 1 has sales on Mondays and zeros the rest of the week, series 2 sells only on Tuesdays, series 3 only on Wednesdays, and so on. For each individual series, a forecast close to 0 would be more or less accurate; however, when you add all those predictions up, the aggregate will be way below the true total.
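This toy example can be sketched in a few lines of numpy (the numbers here are purely illustrative, not from the actual experiments): five sparse series, each selling 10 units on one day of the week, forecast naively as all zeros. Per series the average daily error looks small, but the same forecasts aggregated to the total level are off by much more.

```python
import numpy as np

# 5 sparse series over one week: series i sells 10 units only on day i,
# and nothing the rest of the week.
days = 7
series = np.zeros((5, days))
for i in range(5):
    series[i, i] = 10.0

# A naive per-series forecast of all zeros looks fine at the bottom level:
# each series is wrong on only 1 day out of 7.
forecast = np.zeros_like(series)
per_series_mae = np.abs(series - forecast).mean(axis=1)
print(per_series_mae)  # each series: 10/7 ≈ 1.43 average daily error

# But the aggregate (total sales per day) is 10 on five of the seven days,
# while the aggregated forecast is 0 every day, so the top-level error
# is five times larger than the bottom-level error.
total = series.sum(axis=0)            # [10, 10, 10, 10, 10, 0, 0]
total_forecast = forecast.sum(axis=0)  # all zeros
top_level_mae = np.abs(total - total_forecast).mean()
print(top_level_mae)  # 50/7 ≈ 7.14
```

The per-series errors never cancel because they all point the same way (under-prediction), so aggregation compounds them rather than averaging them out.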
-Rizhiy- t1_j030en1 wrote
Thank you, that makes sense.
xgboostftw t1_j02hrl4 wrote
Where do you see the full experiments? I think only the results table from Amazon is published, no?
fedegarzar OP t1_j04jp9e wrote
Here are the results: https://github.com/Nixtla/statsforecast/tree/main/experiments/amazon_forecast
Here is the step-by-step guide to reproduce results: https://nixtla.github.io/statsforecast/examples/aws/statsforecast.html
Here are the steps for Amazon Forecast: https://nixtla.github.io/statsforecast/examples/aws/amazonforecast.html
Here is the data:
Train set: https://m5-benchmarks.s3.amazonaws.com/data/train/target.parquet
Temporal exogenous variables (used by AmazonForecast): https://m5-benchmarks.s3.amazonaws.com/data/train/temporal.parquet
Static exogenous variables (used by AmazonForecast): https://m5-benchmarks.s3.amazonaws.com/data/train/static.parquet