Submitted by fedegarzar t3_zk6h8q in MachineLearning
>TL;DR: We paid USD 800 and spent 4 hours in the AWS Forecast console so you don't have to.
In this reproducible experiment, we compare Amazon Forecast and StatsForecast, an open-source Python library for statistical methods.
Since AWS Forecast specializes in demand forecasting, we selected the M5 competition dataset as a benchmark; the dataset contains 30,490 series of daily Walmart sales.
We found that Amazon Forecast is 60% less accurate and 669 times more expensive than running an open-source alternative on a simple cloud server.
We also provide a step-by-step guide to reproduce the results.
Results
Amazon Forecast:
- achieved 1.617 in error (measured in wRMSSE, the official evaluation metric used in the competition),
- took 4.1 hours to run,
- and cost 803.53 USD.
An ensemble of statistical methods trained on a c5d.24xlarge EC2 instance:
- achieved 0.669 in error (wRMSSE),
- took 14.5 minutes to run,
- and cost only 1.2 USD.
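For readers unfamiliar with the metric, here is a minimal sketch of how a scaled error like this can be computed. It is not the code used in the experiment: the function names `rmsse` and `wrmsse` are hypothetical, the weights are taken as given, and the official M5 evaluation additionally aggregates over several hierarchy levels and derives the weights from recent dollar sales, which this sketch leaves out.

```python
import math

def rmsse(train, actual, forecast):
    """Root mean squared scaled error for a single series.

    The forecast MSE is scaled by the in-sample MSE of the one-step
    naive forecast (each value predicted by the previous one).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(train) < 2:
        raise ValueError("need at least two training observations")
    naive_sq_errors = [(b - a) ** 2 for a, b in zip(train, train[1:])]
    scale = sum(naive_sq_errors) / len(naive_sq_errors)
    if scale == 0:
        raise ValueError("training series is constant; scale is zero")
    mse = sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse / scale)

def wrmsse(series, weights):
    """Weighted sum of per-series RMSSE values.

    `series` is a list of (train, actual, forecast) tuples and
    `weights` is assumed to be supplied by the caller and sum to 1.
    """
    return sum(w * rmsse(*s) for s, w in zip(series, weights))

# Toy data, purely illustrative (not from the M5 dataset).
toy = [
    ([1, 2, 3, 4], [5, 6], [4, 4]),
    ([0, 2, 0, 2], [2, 2], [0, 0]),
]
score = wrmsse(toy, [0.5, 0.5])
```

Lower is better, and a per-series value of 1.0 means the forecast is as accurate as the naive in-sample forecast.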
For this dataset, we therefore show that:
- Amazon Forecast is 60% less accurate and 669 times more expensive than running an open-source alternative on a simple cloud server.
- Classical methods outperform Machine Learning methods in terms of speed, accuracy, and cost.
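The headline ratios can be checked from the figures reported above. This is only a quick sanity check: the variable names are made up for illustration, and reading "60% less accurate" as the relative error reduction is an assumption about how the figure was derived.

```python
# Figures as reported in the Results section.
aws_cost_usd = 803.53
ensemble_cost_usd = 1.2
aws_wrmsse = 1.617
ensemble_wrmsse = 0.669
aws_minutes = 4.1 * 60
ensemble_minutes = 14.5

# How many times more expensive Amazon Forecast was.
cost_ratio = aws_cost_usd / ensemble_cost_usd

# How many times longer Amazon Forecast took.
time_ratio = aws_minutes / ensemble_minutes

# Assumed reading of "60% less accurate": the share of Amazon
# Forecast's error that the ensemble removes.
error_reduction = (aws_wrmsse - ensemble_wrmsse) / aws_wrmsse

print(f"cost ratio: {cost_ratio:.1f}x")
print(f"time ratio: {time_ratio:.1f}x")
print(f"relative error reduction: {error_reduction:.1%}")
```

Under these assumptions the cost ratio comes out just above 669, the run time ratio is roughly 17, and the relative error reduction is a little under 60%.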
Although using StatsForecast requires some basic knowledge of Python and cloud computing, the results are better for this dataset.
Table
Zealousideal-Card637 t1_izy2kh1 wrote
Interesting comparison. I looked at the full experiments, and Amazon performs slightly better on the bottom level, the actual time series you are forecasting.