PascualMariaEugenia María Eugenia Pascual




Jia Wu goes to UK María Eugenia Pascual

1. Using the data of years 1997-2010, develop a model for forecasting the sales of the year 2011.
2. Compare the predictions of your model with the actual sales of 2011 included in the data set.
3. Justify the choice of the method used.

To decide which model to use, I developed three different models and compared them to see which one fits best. Each model uses multiplicative seasonals, and in each graph the predicted plot (red) covers the years 1997-2010 plus the forecast values for 2011.

For the first model, the trend is a linear expression. The trend and the seasonals are the following:

Trend = 2.369.914 + 3.390*t
Seasonals: 0,96; 0,87; 1,099; 0,906; 0,883; 1,068; 0,88; 0,873; 1,095; 0,933; 0,997; 1,434

The graphs below show the actual and the predicted sales using this model. As we can see, the model doesn't fit well. Because the trend is a straight line with a positive slope, the predicted sales keep increasing, which may not be true, and the model doesn't take variations in the sales into account. The correlation between the actual and predicted sales is 0,89:

t     Forecast    Upper limit   Lower limit
170   2.825.290   3.198.692     2.451.887
171   2.562.311   2.935.713     2.188.908
172   3.241.974   3.615.377     2.868.572
173   2.674.689   3.048.091     2.301.286
174   2.610.477   2.983.880     2.237.074
175   3.159.758   3.533.160     2.786.355
176   2.607.009   2.980.412     2.233.606
177   2.589.445   2.962.848     2.216.042
178   3.253.201   3.626.603     2.879.798
179   2.775.117   3.148.520     2.401.715
180   2.968.694   3.342.097     2.595.291
181   4.273.888   4.647.290     3.900.485
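As a rough check of how the linear model produces these forecasts, the following Python sketch multiplies the linear trend by the seasonal factor of each month. It assumes that the twelve seasonals are listed in calendar order (January to December) and that t = 170 to 181 correspond to the months of 2011, as in the table. Because the published coefficients are rounded, the results only approximate the tabulated values. The names `linear_forecast` and `forecasts_2011` are illustrative and do not come from the report.

```python
# Figures transcribed from the report. The report writes dots as thousands
# separators and commas as decimal marks; here they are plain Python numbers.
INTERCEPT = 2_369_914
SLOPE = 3_390
SEASONALS = [0.96, 0.87, 1.099, 0.906, 0.883, 1.068,
             0.88, 0.873, 1.095, 0.933, 0.997, 1.434]


def linear_forecast(t, month):
    """Linear trend at time index t times the seasonal factor of month (1-12)."""
    trend = INTERCEPT + SLOPE * t
    return trend * SEASONALS[month - 1]


# Assumption: t = 170..181 are January..December 2011, as in the table.
time_indices = list(range(170, 182))
forecasts_2011 = [linear_forecast(t, month)
                  for month, t in enumerate(time_indices, start=1)]

for t, value in zip(time_indices, forecasts_2011):
    print(t, round(value))
```

The values printed this way differ slightly from the table, which is expected if the original spreadsheet used unrounded coefficients.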


For the second model I calculated the trend using a parabola. As we can see from the graphs, this model fits better than the linear one. The correlation between the actual and predicted sales is 0,97, higher than the 0,89 of the linear model. The trend and the seasonals are the following:

Trend = 2.037.603 + 15.116*t - 69*t^2
Seasonals: 0,961; 0,871; 1,1; 0,906; 0,883; 1,068; 0,88; 0,873; 1,095; 0,933; 0,997; 1,434

t     Forecast    Upper limit   Lower limit
170   2.507.478   2.725.785     2.289.170
171   2.265.733   2.484.041     2.047.426
172   2.853.141   3.071.449     2.634.834
173   2.341.249   2.559.557     2.122.942
174   2.274.842   2.493.149     2.056.534
175   2.742.607   2.960.915     2.524.300
176   2.251.647   2.469.955     2.033.339
177   2.225.228   2.443.535     2.006.920
178   2.780.709   2.999.017     2.562.402
179   2.360.407   2.578.714     2.142.099
180   2.512.103   2.730.411     2.293.796
181   3.599.128   3.817.436     3.380.821
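The quadratic model can be sketched in the same way. The Python snippet below evaluates the parabola and multiplies it by the monthly seasonal factor, again assuming that the seasonals are in calendar order and that t = 170 to 181 are the months of 2011. With the rounded coefficients given above, the parabola opens downward and its peak lies well before t = 170, so the trend is already declining over the forecast year. The function names and the variable `peak_t` are illustrative only.

```python
# Figures transcribed from the report (rounded coefficients, so the results
# only approximate the table).
INTERCEPT = 2_037_603
LINEAR = 15_116
QUADRATIC = -69
SEASONALS = [0.961, 0.871, 1.1, 0.906, 0.883, 1.068,
             0.88, 0.873, 1.095, 0.933, 0.997, 1.434]


def quadratic_trend(t):
    """Parabolic trend at time index t."""
    return INTERCEPT + LINEAR * t + QUADRATIC * t ** 2


def quadratic_forecast(t, month):
    """Parabolic trend times the seasonal factor of month (1-12)."""
    return quadratic_trend(t) * SEASONALS[month - 1]


# Assumption: t = 170..181 are January..December 2011, as in the table.
time_indices = list(range(170, 182))
forecasts_2011 = [quadratic_forecast(t, month)
                  for month, t in enumerate(time_indices, start=1)]

# Vertex of the parabola: the time index where the fitted trend peaks.
peak_t = -LINEAR / (2 * QUADRATIC)

for t, value in zip(time_indices, forecasts_2011):
    print(t, round(value))
print("trend peaks near t =", round(peak_t, 1))
```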

If we compare both approaches (the linear and the quadratic), we see that the higher-order model fits better. At this point we could argue that if we keep increasing the number of parameters used to calculate the trend, the model will keep improving and will therefore forecast future values better. Trying this shows that it is not true: the resulting model is not useful for predicting new data.

If the number of parameters is equal to or greater than the number of observations (in this case 12 months), even a simple model like ours can predict the training data perfectly, simply by memorizing it in its entirety. Such a model fails drastically on new or unseen data, because it has not learned to generalize at all. This problem is called overfitting, and we can see an example in the picture below, where the actual sales are predicted perfectly but the forecast values are not.
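The overfitting effect can be reproduced on a small synthetic series. The sketch below does not use the sales data: it builds 12 made-up observations (a straight line plus fixed noise), fits a straight line by least squares, and compares it with the degree-11 polynomial that passes through all 12 points (12 parameters for 12 observations), evaluated here with Lagrange interpolation. All names and the data are assumptions made for illustration.

```python
def lagrange_predict(ts, ys, t_new):
    """Evaluate at t_new the unique polynomial of degree len(ts) - 1
    that passes through every point (ts[i], ys[i])."""
    total = 0.0
    for i, (ti, yi) in enumerate(zip(ts, ys)):
        weight = 1.0
        for j, tj in enumerate(ts):
            if j != i:
                weight *= (t_new - tj) / (ti - tj)
        total += yi * weight
    return total


def fit_line(ts, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


# Synthetic data (not the sales series): true line 100 + 2*t plus fixed noise.
ts = list(range(1, 13))
noise = [3, -2, 4, -1, -3, 2, 1, -4, 3, -2, 2, -1]
ys = [100 + 2 * t + e for t, e in zip(ts, noise)]

# In-sample: the 12-parameter polynomial reproduces every observation.
in_sample_error = max(abs(lagrange_predict(ts, ys, t) - y)
                      for t, y in zip(ts, ys))

# Out-of-sample: predict the next period and compare with the true line.
t_next = 13
true_next = 100 + 2 * t_next
intercept, slope = fit_line(ts, ys)
line_error = abs(intercept + slope * t_next - true_next)
poly_error = abs(lagrange_predict(ts, ys, t_next) - true_next)

print("in-sample error of the polynomial:", in_sample_error)
print("next-period error, straight line:", line_error)
print("next-period error, polynomial:", poly_error)
```

The polynomial matches the 12 training points exactly, yet its error for the very next period is several orders of magnitude larger than that of the straight line.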


Finally, the method I chose is Holt-Winters. This method allows us to overcome two problems: the rigidity of a trend based on a mathematical formula, and the risk that a function fits the actual data adequately but is inadequate for future data. To do so, the method continuously updates both the slope and the seasonals with each new observation; when forecasting future values, it uses the last available slope and the last values of the seasonals.

The graphs below show the results for this method using α, β and γ equal to 0,2. The correlation between the actual and predicted sales in this case is 0,98, which means that this model is better than the quadratic one:

We can try to optimize the values of α, β and γ to obtain a more accurate model. If we use Solver to minimize the MSE, we obtain the optimized values α = 0,42, β = 0,01 and γ = 0,998. For these values the correlation between the actual and predicted sales is 1.
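The procedure can be sketched in Python as follows. This is a minimal version of the standard multiplicative Holt-Winters recursions, not the spreadsheet used in the report. The initialization (level and slope from the first two seasons, seasonals from the first season) is an assumption, and a coarse grid search stands in for Solver, so it will not reproduce the optimized values quoted above. The function names and the demo series are illustrative.

```python
def holt_winters(y, m, alpha, beta, gamma, horizon):
    """Multiplicative Holt-Winters with season length m.

    Returns (one-step-ahead MSE, list of `horizon` forecasts).
    Initialization is an assumption: level = mean of the first season,
    slope = change in seasonal means / m, seasonals = first season / level.
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    first = sum(y[:m]) / m
    second = sum(y[m:2 * m]) / m
    level = first
    slope = (second - first) / m
    seasonals = [value / first for value in y[:m]]

    squared_errors = []
    for t in range(m, len(y)):
        s_old = seasonals[t % m]
        prediction = (level + slope) * s_old          # one-step-ahead forecast
        squared_errors.append((y[t] - prediction) ** 2)
        new_level = alpha * y[t] / s_old + (1 - alpha) * (level + slope)
        slope = beta * (new_level - level) + (1 - beta) * slope
        seasonals[t % m] = gamma * y[t] / new_level + (1 - gamma) * s_old
        level = new_level

    # Future values use the last level, slope and seasonals available.
    n = len(y)
    forecasts = [(level + h * slope) * seasonals[(n + h - 1) % m]
                 for h in range(1, horizon + 1)]
    mse = sum(squared_errors) / len(squared_errors)
    return mse, forecasts


def grid_search(y, m, steps=10):
    """Coarse search over alpha, beta, gamma in (0, 1) minimizing the MSE.

    A simple stand-in for a numerical optimizer such as Solver.
    """
    grid = [i / steps for i in range(1, steps)]
    best = None
    for alpha in grid:
        for beta in grid:
            for gamma in grid:
                mse, _ = holt_winters(y, m, alpha, beta, gamma, horizon=0)
                if best is None or mse < best[0]:
                    best = (mse, alpha, beta, gamma)
    return best


# Synthetic quarterly demo series (not the sales data): trend times seasonals.
pattern = [0.8, 1.1, 0.9, 1.2]
demo = [(100 + 2 * t) * pattern[t % 4] for t in range(24)]

default_mse, default_forecasts = holt_winters(demo, 4, 0.2, 0.2, 0.2, horizon=4)
best_mse, best_alpha, best_beta, best_gamma = grid_search(demo, 4)
print("MSE with all parameters at 0.2:", default_mse)
print("best grid point:", best_alpha, best_beta, best_gamma, "MSE:", best_mse)
```

For monthly sales the season length would be `m=12` and `horizon=12` would give the forecasts for the following year.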


HW forecast

Month   Forecast
1       2.126.244,39
2       2.157.113,68
3       2.826.736,33
4       2.299.128,85
5       2.299.274,09
6       2.718.239,02
7       2.271.977,67
8       2.253.710,20
9       2.790.934,92
10      2.348.868,85
11      2.467.583,23
12      3.474.452,88
