This tutorial uses the example of a catalog company that wants to forecast
monthly sales of its men's clothing line, based on 10 years of sales
data.
In the Forecast bandwidth utilization tutorial, you learned how the Expert Modeler can
decide which is the most appropriate model for your time series. Now, it's time to take a closer
look at the two methods that are available when you choose a model: exponential smoothing and
ARIMA.
To help you decide on an appropriate model, it's a good idea to plot the time series first.
Visual inspection of a time series can often be a powerful guide in helping you choose. In
particular, you need to ask yourself:
Does the series have an overall trend? If so, does the trend appear constant or does it appear
to be dying out with time?
Does the series show seasonality? If so, do the seasonal fluctuations seem to grow with time or
do they appear constant over successive periods?
Preview the tutorial
Watch this video to preview the steps in this tutorial. There might
be slight differences in the user interface that is shown in the video. The video is intended to be
a companion to the written tutorial. This video provides a visual method to learn the concepts and
tasks in this documentation.
This tutorial uses the Forecasting Catalog Sales flow in the sample project. The data file
used is catalog_seasfac.csv. The following image shows the sample modeler flow.
Figure 1. Sample modeler flow
The following image shows the sample data set.
Figure 2. Sample data set
Task 1: Open the sample project
The sample project contains several data sets and sample modeler flows. If you don't already have
the sample project, then refer to the Tutorials topic to create the sample project. Then follow these steps to open the sample
project:
In watsonx, from the Navigation menu, choose
Projects > View all Projects.
Click SPSS Modeler Project.
Click the Assets tab to see the data sets and modeler flows.
Check your progress
The following image shows the project Assets tab. You are now ready to work with the sample
modeler flow associated with this tutorial.
Follow these steps to use a Time plot node to visualize the data:
Add a Time plot node:
In the node palette, expand the Graphs section.
Drag the Time plot node onto the canvas.
Connect the Type node to the new Time plot node.
Double-click the Time plot node to set its properties.
In the Series section, click Add columns.
Select the men field.
Click OK.
Select Use custom x axis field label.
For the X axis label, select date.
Clear the Normalize option.
Click Save.
Hover over the [men] v. date node, and click the Run icon.
In the Outputs and models pane, click the results with the name [men] v. date to
view the graph.
The series shows a general upward trend; that is, the series values tend to
increase over time. The upward trend is seemingly constant, which indicates a linear
trend.
The series also has a distinct seasonal pattern with annual highs in December, as
indicated by the vertical lines on the graph. The seasonal variations appear to grow with the upward
series trend, which suggests multiplicative rather than additive seasonality.
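The visual check for additive versus multiplicative seasonality can also be approximated numerically. The following is a minimal Python sketch, not part of the SPSS Modeler flow: it compares the within-year swing (maximum minus minimum) across successive years. The function names and the synthetic series are assumptions made up for illustration; they are not the catalog data.

```python
def yearly_level_and_swing(values, period=12):
    """Split a series into whole seasons and return (mean, swing) per season.

    swing = max - min within the season, a rough proxy for seasonal amplitude.
    Any incomplete season at the end of the series is ignored.
    """
    seasons = [values[i:i + period]
               for i in range(0, len(values) - period + 1, period)]
    return [(sum(s) / period, max(s) - min(s)) for s in seasons]


def synthetic_series(n_years, multiplicative):
    """Made-up monthly series: a linear trend plus a December spike."""
    out = []
    for t in range(n_years * 12):
        level = 100.0 + 2.0 * t
        december = (t % 12 == 11)
        if multiplicative:
            # Spike is proportional to the current level.
            out.append(level * (1.5 if december else 1.0))
        else:
            # Spike is a fixed amount regardless of the level.
            out.append(level + (50.0 if december else 0.0))
    return out


for name, flag in (("additive", False), ("multiplicative", True)):
    stats = yearly_level_and_swing(synthetic_series(3, flag))
    print(name, [swing for _, swing in stats])
```

In this synthetic case the additive swings stay flat from year to year, while the multiplicative swings grow with the level. A swing that grows roughly in proportion to the yearly mean is the pattern that the time plot of men suggests.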
Now that you've
identified the characteristics of the series, you're ready to try modeling it. The exponential
smoothing method is useful for forecasting series that exhibit trend, seasonality, or both. As
previously seen, this data exhibits both characteristics.
Check your progress
The following image shows a graph. You are now ready to build the model.
Building a best-fit exponential smoothing model involves determining the model type (whether the
model needs to include trend, seasonality, or both) and then obtaining the best-fit parameters for
the chosen model.
The plot of men's clothing sales over time suggested a model with both a linear trend component
and a multiplicative seasonality component. This implies a Winters' model. First, however, you
explore a simple model (no trend and no seasonality) and then a Holt's model (incorporates linear
trend but no seasonality). This will give you practice in identifying when a model is not a good fit
to the data, an essential skill in successful model building.
Follow these steps to build a simple exponential smoothing model:
Double-click the Men (Time Series) node to view its properties.
Expand the Observations and time interval section, and set these properties:
Verify that the Time/date is set to date.
Verify that the Time Interval is set to Months.
Expand the Build options - general section, and set these properties:
Verify that the Method is set to Exponential Smoothing.
Verify that the Model Type is set to Simple.
Click Save.
Click Run all.
In the Outputs and models pane, click the output results with the name Time plot of
[men $TS-men] v. date to view the graph.
The men plot represents the
actual data, while $TS-men denotes the time series model.
Figure 3. Simple exponential smoothing model
Although the simple model does, in fact, exhibit a gradual (and rather
ponderous) upward trend, it takes no account of seasonality. You can safely reject this
model.
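To see why the simple model cannot follow seasonality, it can help to look at the recursion behind it. The following Python sketch is an illustration only: the smoothing weight alpha is picked by hand here, whereas the Time Series node estimates its own parameters, and the initialization (level set to the first observation) is one common choice that is assumed for this example.

```python
def simple_exp_smoothing(values, alpha):
    """Return (one-step-ahead fitted values, final smoothed level).

    Recursion: level_t = alpha * y_t + (1 - alpha) * level_{t-1}
    The level starts at the first observation (an assumed initialization).
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]
    fitted = []
    for y in values:
        fitted.append(level)                      # forecast made before seeing y
        level = alpha * y + (1 - alpha) * level   # update after seeing y
    return fitted, level


def forecast_simple(level, horizon):
    """Every future period gets the same value: the last smoothed level."""
    return [level] * horizon


# Made-up series with a spike every fourth period.
series = [10.0, 12.0, 14.0, 30.0, 12.0, 14.0, 16.0, 32.0]
fitted, last_level = simple_exp_smoothing(series, alpha=0.5)
print(forecast_simple(last_level, horizon=4))
```

Because the model carries a single level and nothing else, its forecasts are a flat line. It has no component that could reproduce a recurring December peak.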
Now try a Holt's linear model. This should at least model the trend better than the
simple model, although it is also unlikely to capture the seasonality.
Double-click the Men (Time Series) node and set these properties:
Expand the Build options - general section.
Set the Model Type to Holt's linear trend.
Click Save.
Click Run all.
In the Outputs and models pane, click the output results with the name Time plot of
[men $TS-men] v. date to view the graph.
Holt's model displays a smoother upward trend than
the simple model, but it still takes no account of the seasonality, so you can disregard this one
too.
Figure 4. Holt's linear trend model
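Holt's method adds a trend term to the simple recursion. The sketch below is a plain-Python illustration under stated assumptions: alpha and beta are hand-picked rather than estimated, and the level and trend are initialized from the first two observations, which is one simple convention and not necessarily what the Time Series node does.

```python
def holt_linear(values, alpha, beta):
    """Holt's linear trend smoothing.

    Returns (one-step-ahead fitted values for values[1:], level, trend).
    Assumed initialization: level = first value, trend = first difference.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    level = values[0]
    trend = values[1] - values[0]
    fitted = []
    for y in values[1:]:
        fitted.append(level + trend)  # forecast made before seeing y
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return fitted, level, trend


def forecast_holt(level, trend, horizon):
    """Forecasts continue along a straight line from the last level."""
    return [level + h * trend for h in range(1, horizon + 1)]


# Made-up, perfectly linear series: the model should track it exactly.
fitted, level, trend = holt_linear([5.0, 7.0, 9.0, 11.0], alpha=0.5, beta=0.5)
print(forecast_holt(level, trend, horizon=3))
```

The forecasts lie on a straight line, which is why this model can follow the upward trend but still cannot reproduce the seasonal peaks.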
You may recall that the initial plot of men's clothing sales over time suggested a
model incorporating a linear trend and multiplicative seasonality. A more suitable candidate,
therefore, might be Winters' model.
Double-click the Men (Time Series) node and set these properties:
Expand the Build options - general section.
Set the Model Type to Winters' multiplicative.
Click Save.
Click Run all.
In the Outputs and models pane, click the output results with the name Time plot of
[men $TS-men] v. date to view the graph.
This looks better. The model reflects both the trend
and the seasonality of the data. The dataset covers a period of 10 years and includes 10 seasonal
peaks occurring in December of each year. The 10 peaks present in the predicted results match up
well with the 10 annual peaks in the real data.
However, the results also underscore the
limitations of the Exponential Smoothing procedure. Looking at both the upward and downward spikes,
there is significant structure that's not accounted for.
If you're primarily interested in
modeling a long-term trend with seasonal variation, then exponential smoothing may be a good choice.
To model a more complex structure such as this one, you need to consider using the ARIMA
procedure.
Figure 5. Winters' multiplicative model
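The Winters' multiplicative model combines a level, a linear trend, and seasonal factors that scale the level. The following Python sketch shows the standard recursion under stated assumptions: the smoothing weights are hand-picked, and the initial level, trend, and seasonal factors come from a simple heuristic based on the first two seasons. The Time Series node estimates these quantities itself, so treat this as an illustration of the idea rather than a reproduction of its results.

```python
def winters_multiplicative(values, period, alpha, beta, gamma):
    """Winters' (Holt-Winters) smoothing with multiplicative seasonality.

    Returns (one-step-ahead fitted values, level, trend, seasonal factors).
    Assumed initialization from the first two full seasons.
    """
    if len(values) < 2 * period:
        raise ValueError("need at least two full seasons")
    first = values[:period]
    second = values[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / (period * period)
    seasonal = [y / level for y in first]

    fitted = []
    for t, y in enumerate(values):
        s = seasonal[t % period]
        fitted.append((level + trend) * s)  # forecast made before seeing y
        new_level = alpha * (y / s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y / new_level) + (1 - gamma) * s
        level = new_level
    return fitted, level, trend, seasonal


def forecast_winters(level, trend, seasonal, n_obs, horizon):
    """Project the trend forward and scale by the matching seasonal factor."""
    period = len(seasonal)
    return [(level + h * trend) * seasonal[(n_obs + h - 1) % period]
            for h in range(1, horizon + 1)]


# Made-up series: a repeating 4-period pattern with no trend.
series = [10.0, 20.0, 30.0, 20.0] * 3
fitted, level, trend, seasonal = winters_multiplicative(
    series, period=4, alpha=0.5, beta=0.5, gamma=0.5)
print(forecast_winters(level, trend, seasonal, n_obs=len(series), horizon=4))
```

For monthly data such as the catalog sales, period would be 12. Because the seasonal factors multiply the level, the size of the seasonal peaks grows as the level rises.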
Check your progress
The following image shows the flow. You are now ready to build an ARIMA model.
With the ARIMA procedure, you can create an autoregressive integrated moving-average (ARIMA)
model that is suitable for finely tuned modeling of time series.
ARIMA models provide more sophisticated methods for modeling trend and seasonal components than
do exponential smoothing models, and they have the added benefit of being able to include predictor
variables in the model.
Continuing the example of the catalog company that wants to develop a forecasting model, you have
seen how the company has collected data on monthly sales of men's clothing along with several series
that might be used to explain some of the variation in sales. Possible predictors include the number
of catalogs mailed and the number of pages in the catalog, the number of phone lines open for
ordering, the amount spent on print advertising, and the number of customer service
representatives.
Are any of these predictors useful for forecasting? Is a model with predictors really better than
one without? Using the ARIMA procedure, you can create a forecasting model with predictors, and see
if there's a significant difference in predictive ability over the exponential smoothing model with
no predictors.
With the ARIMA method, you can fine-tune the model by specifying orders of autoregression,
differencing, and moving average, along with seasonal counterparts to these components. Determining
the best values for these components manually can be a time-consuming process that involves a good
deal of trial and error. For this example, you let the Expert Modeler choose an ARIMA model
for you.
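Of the orders mentioned above, differencing is the easiest to illustrate by hand. The order of differencing counts how many times the series is differenced at lag 1, and its seasonal counterpart does the same at the seasonal lag. The Python sketch below uses a made-up series and a hypothetical helper named difference; it only illustrates the transformation and does not fit an ARIMA model.

```python
def difference(values, lag=1):
    """Return y_t - y_{t-lag}; the result is shorter than the input by lag."""
    if lag < 1 or lag >= len(values):
        raise ValueError("lag must be between 1 and len(values) - 1")
    return [values[i] - values[i - lag] for i in range(lag, len(values))]


# Made-up series: linear trend (slope 2) plus a repeating 4-period pattern.
pattern = [0, 5, -3, 10]
series = [2 * t + pattern[t % 4] for t in range(16)]

seasonal_diff = difference(series, lag=4)    # removes the repeating pattern
both_diffs = difference(seasonal_diff, lag=1)  # removes what is left of the trend

print(seasonal_diff)
print(both_diffs)
```

Here one seasonal difference leaves a constant, and one further ordinary difference leaves zeros. For monthly data the seasonal lag would be 12. Real series are noisier, which is part of why choosing the orders manually takes trial and error.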
Next, you build a better model by treating some of the other variables in the dataset as
predictor variables. The ones that seem most useful to include as predictors are the number of
catalogs mailed (mail), the number of pages in the catalog (page),
the number of phone lines open for ordering (phone), the amount spent on print
advertising (print), and the number of customer service representatives
(service).
Follow these steps to build an ARIMA model:
Double-click the Type node to set its properties.
Verify that the roles for the mail, page, phone, print, and
service fields are set to Input.
Verify that the role for men is set to Target.
Set the Role for all of the remaining fields to None.
Click Save.
Double-click the Men (Time Series) node and set these properties:
Expand the Build options - general section.
Set the Method to Expert Modeler.
Set the Model Type to ARIMA models only.
Select the Expert Modeler considers seasonal models option.
Figure 6. Choose only ARIMA models
Click Save.
Click Run all.
In the Outputs and models pane, click the model with the name men to view the
model details.
On the Models page, click men in the Target column.
Click the Model Information page. Notice how the Expert Modeler has chosen only two of
the five specified predictors as being significant to the model.
Figure 7. Expert Modeler chooses two predictors
Close the two model windows.
In the Outputs and models pane, click the output results with the name Time plot of
[men $TS-men] v. date to view the graph.
This model improves on the previous one by capturing
the large downward spike as well, making it the best fit so far.
Figure 8. ARIMA model with predictors specified
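The comparison in this tutorial is visual, but the same judgment can be backed by a numeric fit measure. The following Python sketch computes two common ones, root mean squared error and mean absolute percentage error, from paired actual and predicted values such as men and $TS-men. The function names and the sample numbers are assumptions for illustration; they are not taken from the model output.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an actual value is zero, so that case raises an error.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up actual values and two made-up sets of predictions.
actual = [100.0, 200.0, 150.0]
model_a = [110.0, 180.0, 150.0]
model_b = [130.0, 150.0, 120.0]
for name, pred in (("model A", model_a), ("model B", model_b)):
    print(name, round(rmse(actual, pred), 2), round(mape(actual, pred), 2))
```

Lower values indicate a closer fit on the data that the model was built from. They do not by themselves guarantee better forecasts on new data.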
You could refine the model even further, but any improvements from this point
on are likely to be minimal. You've established that the ARIMA model with predictors is preferable,
so use this model to forecast sales for the coming year.
Close the graph window.
Double-click the Men (Time Series) node and set these properties:
Expand the Model options section.
Select the Extend records into the future option, and set the value to
12.
Select the Compute future values of inputs option.
Click Save.
Click Run all.
In the Outputs and models pane, click the output results with the name Time plot of
[men $TS-men] v. date to view the graph.
The forecast looks good. As expected, there's a
return to normal sales levels following the December peak, and a steady upward trend in the second
half of the year, with sales in general better than those for the previous year.
Check your progress
The following image shows a graph using the ARIMA model.
You've successfully modeled a complex time series, incorporating not only an upward trend but
also seasonal and other variations. You've also seen how, through trial and error, you can get
closer and closer to an accurate model, which you can then use to forecast future sales.
In practice, you would need to reapply the model as your actual sales data are updated; for
example, every month or every quarter, and produce updated forecasts.
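The update cycle described above can be pictured as a loop: each time a new observation arrives, it is appended to the history and a fresh forecast is produced. The Python sketch below is only an outline of that loop. The seasonal_naive function is a deliberately trivial placeholder forecaster (it repeats the value from one season earlier) standing in for the real model, and all names and numbers are assumptions for illustration.

```python
def seasonal_naive(history, horizon, period=12):
    """Placeholder forecaster: repeat the values from the most recent season."""
    if len(history) < period:
        raise ValueError("need at least one full season of history")
    n = len(history)
    return [history[n - period + (h % period)] for h in range(horizon)]


def rolling_forecasts(initial, updates, forecaster, horizon):
    """Append each new observation, then re-forecast from the longer history.

    Returns one forecast list per update, in arrival order.
    """
    history = list(initial)
    results = []
    for new_value in updates:
        history.append(new_value)
        results.append(forecaster(history, horizon))
    return results


# Made-up data: one year of history, then two new months arrive.
initial = [float(v) for v in range(1, 13)]
updates = [101.0, 102.0]
for forecast in rolling_forecasts(initial, updates, seasonal_naive, horizon=3):
    print(forecast)
```

In a real deployment, the forecaster argument would be replaced by whatever refits or reapplies the chosen model, and the loop would run on the schedule at which new sales data become available.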