You and your group members are part of a team of analysts working for an energy consultancy, where you cover the crude oil market. You produce reports and research papers for your clients, which are large companies whose profits are sensitive to oil prices. One of the regular reports you write each month provides a forecast for the Brent crude oil price in the following month. Recently, several clients have asked for details on the methodology underlying those forecasts. (Since these companies make important decisions based on oil price forecasts, they need to be reassured about the rigour of the forecasting approach.) In response to these requests, your boss has asked you to write a report that explains your modelling approach and applies it to the problem of forecasting monthly Brent crude oil prices.
The modelling approach you follow is known as the Box-Jenkins methodology, introduced by Box and Jenkins (1976). It is a formal procedure for identifying, estimating and testing an ARIMA(p,d,q) model, given a sample of time series data, and comprises the following three stages, as illustrated by Figure 1:
- Model identification.
- Model estimation.
- Diagnostic testing.
In stage 1, the objective is to identify plausible values for the ARIMA orders p, d and q. This stage consists of the following steps:
- Plot the time series and inspect it for evidence of non-stationarity.
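- Plot the autocorrelation function (ACF) for the series and inspect it for evidence of non-stationarity, such as autocorrelations that decay very slowly.
- Perform a Dickey-Fuller (DF) test and/or an augmented Dickey-Fuller (ADF) test on the series, to determine whether it is stationary or unit-root non-stationary.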
Figure 1: The Box-Jenkins methodology.
- If the previous steps indicate that the series is non-stationary, then replace it with its first difference and return to step (a); otherwise, continue to step (e). The number of times you repeat steps (a)–(c) gives you a candidate value for the order of integration d.
- You should now have a series that is presumed to be stationary. Plot its ACF and PACF and inspect them for evidence that the series fits an AR(p) model, an MA(q) model, or an ARMA(p,q) model.
- Use the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) to obtain candidate values for the autoregressive and moving average orders p and q. Compare them with the values for p and q inferred in the previous step.
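The mechanics behind the identification steps can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a prescribed implementation: the function names (first_difference, sample_acf, aic, bic) are hypothetical, the data are a simulated random walk rather than the Brent series, and the parameter count used in the AIC and BIC follows one common convention (software packages differ on whether the error variance is counted).

```python
import math
import random


def first_difference(series):
    """Return the first difference y[t] - y[t-1] (one element shorter)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def sample_acf(series, max_lag):
    """Return the sample autocorrelations at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def aic(loglik, n_params):
    """Akaike information criterion: smaller is better."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: penalises extra parameters more heavily."""
    return -2.0 * loglik + n_params * math.log(n_obs)


# Hypothetical data: a simulated random walk standing in for a price series.
random.seed(0)
level = [0.0]
for _ in range(499):
    level.append(level[-1] + random.gauss(0.0, 1.0))

diff = first_difference(level)
acf_level = sample_acf(level, 5)
acf_diff = sample_acf(diff, 5)
band = 1.96 / math.sqrt(len(diff))  # approximate 95% band under white noise

print("ACF of level:     ", [round(r, 2) for r in acf_level[1:]])
print("ACF of difference:", [round(r, 2) for r in acf_diff[1:]])
print("Approximate 95% band: +/-", round(band, 3))
```

For a unit-root series, the ACF of the level typically stays close to one across the first few lags, while the ACF of the first difference falls inside the band. That pattern would point to d = 1.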
By the end of stage 1, you should have a candidate ARIMA(p,d,q) model for the series. In stage 2, you estimate that model:
- Use maximum likelihood estimation (MLE) to fit the chosen model to the series, and write down the equation for the fitted model.
- Use the conditional sum of squares (CSS) method to fit the chosen model to the series, and write down the equation for the fitted model.
- Compare the two estimated models for consistency.
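To see why the two estimation methods should give similar answers, the sketch below compares them for the simplest possible case, a zero-mean AR(1) model. It is an illustration under stated assumptions and not a general ARIMA estimator: the function names (css_ar1, exact_loglik_ar1, mle_ar1) are hypothetical, the data are simulated with a true coefficient of 0.6, and the MLE is found by a crude grid search over the exact Gaussian log-likelihood instead of a numerical optimiser.

```python
import math
import random


def css_ar1(y):
    """CSS estimate of phi for a zero-mean AR(1).

    Conditioning on the first observation, minimising
    sum((y[t] - phi * y[t-1])**2) has a closed-form least-squares solution.
    """
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den


def exact_loglik_ar1(y, phi):
    """Exact Gaussian log-likelihood of a zero-mean AR(1).

    The innovation variance is concentrated out, so the result depends
    on phi only. Unlike CSS, the first observation contributes as well.
    """
    n = len(y)
    s = (1.0 - phi ** 2) * y[0] ** 2 + sum(
        (y[t] - phi * y[t - 1]) ** 2 for t in range(1, n)
    )
    return (
        -0.5 * n * (math.log(2.0 * math.pi) + 1.0)
        - 0.5 * n * math.log(s / n)
        + 0.5 * math.log(1.0 - phi ** 2)
    )


def mle_ar1(y):
    """Grid-search MLE of phi over the stationary region (-0.99, 0.99)."""
    grid = [k / 1000.0 for k in range(-990, 991)]
    return max(grid, key=lambda phi: exact_loglik_ar1(y, phi))


# Hypothetical data: simulate an AR(1) with phi = 0.6 and drop a burn-in.
random.seed(1)
y = [0.0]
for _ in range(449):
    y.append(0.6 * y[-1] + random.gauss(0.0, 1.0))
y = y[50:]

phi_css = css_ar1(y)
phi_mle = mle_ar1(y)
print("CSS estimate of phi:", round(phi_css, 3))
print("MLE estimate of phi:", round(phi_mle, 3))
```

The two estimates differ only through the treatment of the first observation, so with a few hundred observations they should agree closely. A large gap between them in practice would be a warning sign worth investigating.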
At the end of stage 2, you should have an ARIMA(p,d,q) model that has been fitted to the data. Stage 3 is concerned with a critical analysis of that model. It consists of the following steps:
- Plot the residual series from the estimated model and inspect the graph for evidence of non-stationarity.
- Test the adequacy of the fitted model by plotting the ACF for its residuals and performing t-tests on the residual autocorrelations.
- Perform Ljung-Box (LB) tests on the residual autocorrelations from the fitted model to further test its adequacy.
- Perform t-tests on the coefficient estimates for the fitted model. If there are redundant coefficients, then estimate a reduced model that eliminates them and return to step (a).
- Over-fit the model by estimating a similar model with slightly higher autoregressive and/or moving average orders. The over-fitted model should contain redundant coefficients if the original model provides the best fit to the data.
- Under-fit the model by estimating a similar model with slightly lower autoregressive and/or moving average orders. The under-fitted model should be inadequate if the original model provides the best fit to the data.
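The residual checks in stage 3 rest on two simple statistics, sketched below with the standard library only. The function names (sample_acf, acf_t_ratios, ljung_box_q) and both residual series are hypothetical: one series is simulated white noise, standing in for an adequate model, and the other is strongly autocorrelated, standing in for an inadequate one. The critical value 18.307 is the 95% point of a chi-squared distribution with 10 degrees of freedom. For residuals from a fitted ARMA(p,q) model, the degrees of freedom are usually reduced to the number of lags minus p + q.

```python
import math
import random


def sample_acf(series, max_lag):
    """Return the sample autocorrelations at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def acf_t_ratios(resid, max_lag):
    """t-ratios for individual autocorrelations, using a standard error
    of 1/sqrt(n); values beyond about +/-1.96 are significant at 5%."""
    n = len(resid)
    rho = sample_acf(resid, max_lag)
    return [r * math.sqrt(n) for r in rho[1:]]


def ljung_box_q(resid, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} rho_k^2 / (n - k)."""
    n = len(resid)
    rho = sample_acf(resid, max_lag)
    return n * (n + 2) * sum(rho[k] ** 2 / (n - k) for k in range(1, max_lag + 1))


# Hypothetical residual series for illustration only.
random.seed(2)
white = [random.gauss(0.0, 1.0) for _ in range(300)]
sticky = [0.0]
for _ in range(299):
    sticky.append(0.9 * sticky[-1] + random.gauss(0.0, 1.0))

CHI2_95_DF10 = 18.307  # 95% critical value, chi-squared with 10 d.f.

for name, resid in [("white noise", white), ("autocorrelated", sticky)]:
    q = ljung_box_q(resid, 10)
    verdict = "reject adequacy" if q > CHI2_95_DF10 else "no evidence against adequacy"
    print(f"{name}: Q(10) = {q:.1f} -> {verdict}")
```

A model passes this check when Q stays below the critical value and few, if any, of the individual t-ratios exceed 1.96 in absolute value.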
If you have a fitted model that passes the diagnostic tests, it can be used for forecasting. If not, you should return to stage 1 and try to identify a better model.
The text file Brent Crude Oil.txt contains monthly observations for the price of Brent crude oil, in U.S. dollars per barrel, over the period from 1 January 1990 to 1 July 2020. The data was compiled by the International Monetary Fund and was downloaded from the Federal Reserve Bank of St. Louis. Split the series into an estimation sample, covering the period from 1 January 1990 to 1 January 2018, and a forecasting sample, covering the remaining time. Use the estimation sample to identify, estimate and test an ARIMA model for the Brent crude oil price, by following the Box-Jenkins methodology. Then use the fitted model to forecast the monthly price of Brent crude oil over the forecast period. Plot the actual Brent crude oil price over the period from 1 January 2016 to 1 July 2020, and overlay the forecasts and 95% forecast confidence intervals, for the forecasting period, on the same graph. On a separate graph, plot the percentage forecast error over the forecasting period, and assess the quality of the forecasts. Finally, perform a t-test to ascertain whether the average forecast error over the forecasting period is statistically different from zero.
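The forecast evaluation at the end of the task can be sketched as follows. This is a minimal illustration under stated assumptions: the function names (percentage_errors, mean_error_t_stat) are hypothetical, the percentage error is defined here as 100 × (actual − forecast) / actual (other conventions exist), and the six prices and forecasts are invented numbers, not Brent data. The t-statistic treats the forecast errors as independent, which is only an approximation for multi-step forecasts.

```python
import math
import statistics


def percentage_errors(actual, forecast):
    """Percentage forecast errors, 100 * (actual - forecast) / actual."""
    return [100.0 * (a - f) / a for a, f in zip(actual, forecast)]


def mean_error_t_stat(errors):
    """t-statistic for H0: the mean forecast error is zero.

    Uses the sample standard deviation (n - 1 divisor); compare the
    result with a Student-t critical value on n - 1 degrees of freedom.
    """
    n = len(errors)
    return statistics.mean(errors) / (statistics.stdev(errors) / math.sqrt(n))


# Invented numbers for illustration only (not actual Brent prices).
actual = [65.0, 66.0, 72.0, 77.0, 74.0, 74.0]
forecast = [69.0, 69.0, 69.0, 69.0, 69.0, 69.0]

errors = [a - f for a, f in zip(actual, forecast)]
pct = percentage_errors(actual, forecast)
t_stat = mean_error_t_stat(errors)

print("Percentage errors:", [round(p, 1) for p in pct])
print("t-statistic for mean error:", round(t_stat, 2))
# With 30 monthly forecasts (29 degrees of freedom), the two-sided
# 5% critical value is approximately 2.045.
```

If the absolute t-statistic is below the critical value, there is no statistical evidence that the forecasts are biased on average over the forecasting period.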
The analysis above should be presented in a 10-page (maximum) report to your clients. Remember that the objective is to reassure them that your forecasting approach is rigorous and reliable, so the report should provide a clear account of your forecasting methodology. However, since they are business professionals, not econometricians, it is important to pitch your explanations at an appropriate level.
Your clients pay a lot of money for your consulting services, so you should regard the report as a marketing document designed to convince them to continue using your services. It could also be used to pitch your services to potential new clients. With that in mind, the professionalism, structure and presentation quality of the report are very important. Finally, be sure to cite any articles or websites used.
Box, G. E. P. and G. M. Jenkins (1976). Time Series Analysis: Forecasting and Control. Holden-Day.