Seasonal Trend Decomposition using Loess (STL) is an algorithm that divides a time series into three components: the trend, the seasonality and the remainder. The methodology was presented by Robert Cleveland, William Cleveland, Jean McRae and Irma Terpenning in the Journal of Official Statistics in 1990. STL is available within R via the **stl** function.

The use of the **stl** function can be demonstrated using one of the data sets available within the base R installation. The well-used *nottem* data set (Average Monthly Temperatures at Nottingham, 1920-1939) is a good starting point. The data itself is presented here:

    > nottem
          Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
    1920 40.6 40.8 44.4 46.7 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8
    1921 44.2 39.8 45.1 47.0 54.1 58.7 66.3 59.9 57.0 54.2 39.7 42.8
    1922 37.5 38.7 39.5 42.1 55.7 57.8 56.8 54.3 54.3 47.1 41.8 41.7
    1923 41.8 40.1 42.9 45.8 49.2 52.7 64.2 59.6 54.4 49.2 36.3 37.6
    1924 39.3 37.5 38.3 45.5 53.2 57.7 60.8 58.2 56.4 49.8 44.4 43.6
    1925 40.0 40.5 40.8 45.1 53.8 59.4 63.5 61.0 53.0 50.0 38.1 36.3
    1926 39.2 43.4 43.4 48.9 50.6 56.8 62.5 62.0 57.5 46.7 41.6 39.8
    1927 39.4 38.5 45.3 47.1 51.7 55.0 60.4 60.5 54.7 50.3 42.3 35.2
    1928 40.8 41.1 42.8 47.3 50.9 56.4 62.2 60.5 55.4 50.2 43.0 37.3
    1929 34.8 31.3 41.0 43.9 53.1 56.9 62.5 60.3 59.8 49.2 42.9 41.9
    1930 41.6 37.1 41.2 46.9 51.2 60.4 60.1 61.6 57.0 50.9 43.0 38.8
    1931 37.1 38.4 38.4 46.5 53.5 58.4 60.6 58.2 53.8 46.6 45.5 40.6
    1932 42.4 38.4 40.3 44.6 50.9 57.0 62.1 63.5 56.3 47.3 43.6 41.8
    1933 36.2 39.3 44.5 48.7 54.2 60.8 65.5 64.9 60.1 50.2 42.1 35.8
    1934 39.4 38.2 40.4 46.9 53.4 59.6 66.5 60.4 59.2 51.2 42.8 45.8
    1935 40.0 42.6 43.5 47.1 50.0 60.5 64.6 64.0 56.8 48.6 44.2 36.4
    1936 37.3 35.0 44.0 43.9 52.7 58.6 60.0 61.1 58.1 49.6 41.6 41.3
    1937 40.8 41.0 38.4 47.4 54.1 58.6 61.4 61.8 56.3 50.9 41.4 37.1
    1938 42.1 41.2 47.3 46.6 52.4 59.0 59.6 60.4 57.0 50.7 47.8 39.2
    1939 39.4 40.9 42.4 47.8 52.4 58.0 60.7 61.8 58.2 46.7 46.6 37.8
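Because **stl** relies on the periodicity stored in the time series object, it can be worth checking those attributes before decomposing. The following is a minimal sketch using only base R functions on the *nottem* data; nothing here is specific to STL.

```r
# Inspect the time series attributes that stl() will rely on.
class(nottem)      # the class of the object
frequency(nottem)  # number of observations per cycle (months per year)
start(nottem)      # first observation: year and month
end(nottem)        # last observation: year and month
length(nottem)     # total number of monthly values
```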

We can try to run **stl** by specifying only the time series, but **R** returns an error message:

    > stl(nottem)
    Error in stl(nottem) : argument "s.window" is missing, with no default

Looking at the help page, we find the following description of the *s.window* argument: *either the character string “periodic” or the span (in lags) of the loess window for seasonal extraction, which should be odd.* If we use the *periodic* option, R runs happily:

    > nottem.stl = stl(nottem, s.window="periodic")
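For comparison, the numeric form of *s.window* described in the help text can be sketched as follows. The span of 13 is an arbitrary odd value chosen purely for illustration, and the object names are hypothetical; this is not a recommendation for this data set.

```r
# Periodic seasonal component: the same 12 monthly values repeat every year.
fit.periodic <- stl(nottem, s.window = "periodic")

# Numeric span (13 is an arbitrary odd value used for illustration only):
# the seasonal pattern is allowed to change slowly from year to year.
fit.span <- stl(nottem, s.window = 13)

seasonal.periodic <- fit.periodic$time.series[, "seasonal"]
seasonal.span <- fit.span$time.series[, "seasonal"]

# Compare the January seasonal values across the 20 years for both fits.
jan <- cycle(nottem) == 1
range(seasonal.periodic[jan])  # a single repeated value for the periodic fit
range(seasonal.span[jan])      # may vary between years for the numeric span
```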

Now that we have the STL decomposition, we can use the plot function provided for objects created by a call to **stl**.

    > plot(nottem.stl)

The graph looks like this:

The four panels show the original data, the seasonal component, the trend component and the remainder. They show the periodic seasonal pattern extracted from the original data and a trend that moves around between 47 and 51 degrees Fahrenheit. There is a bar at the right-hand side of each panel to allow a relative comparison of the magnitudes of the components. For this data the change in trend is smaller than the variation due to the monthly (seasonal) pattern.
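The components shown in the plot can also be inspected numerically. The sketch below assumes the same periodic fit as above and uses the *time.series* matrix stored in the **stl** object; the variable names are illustrative.

```r
nottem.stl <- stl(nottem, s.window = "periodic")

# The decomposition is stored as a multivariate time series with the
# columns "seasonal", "trend" and "remainder".
components <- nottem.stl$time.series
head(components)

# The three components add back up to the original series.
reconstructed <- rowSums(components)
all.equal(as.numeric(reconstructed), as.numeric(nottem))

# Range of the trend, for comparison with the values read off the plot.
range(components[, "trend"])
```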

I have analysed a long time series of mean vegetation response values over various areas (25 years, twice monthly, freq=24), mainly to discover trend. All plots show very regular annual periodicity: summer response low, winter response high.

I have similarly created stl plots for rainfall (25 years, monthly rain, freq=12), which showed little seasonality, as expected; rain is considered aseasonal in this region.

I found the command summary(stl) very useful, as it shows the % of the seasonal component. This all made sense until one of my regions showed a seasonal value greater than 100%.

I have been using lines such as:

“The analysis revealed that 76% of the data are expressed in the seasonal component”

“The seasonal component represented 91% of the data.”

“The stl plot showed strong seasonality at 92%”

“STL analysis of local rainfall indicated that only 6% of rainfall variability over the 25 years was attributable to seasonality, which confirms that rainfall in this desert region is aseasonal”. Am I using this information correctly?

Are the above statements valid, or should they be worded differently?

The seasonal percentage in the summary is the percentage of what? I mean how can I best explain >100%?

Sorry for bothering you with this. I have been looking all over the web and in the help menus for an answer, but have been unable to find a reference to the meaning of the % seasonal in the summary(stl) output.

Thanks, Erik

Erik,

When you look at the code for summary.stl, it appears that the % values are based on inter-quartile range (IQR) calculations for the trend, seasonality, remainder and original data. The IQR of each of the three components is reported as a % of the IQR of the data, so it is not a sum-of-squares decomposition, which I think is how you would like to interpret the value.

These values correspond to the bars on the right-hand side of the four plots, which show the relative contribution of the components.
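As a rough check of this reading, the percentages can be recomputed by hand. The sketch below uses the *nottem* fit rather than the vegetation or rainfall data (which are not available here), and the variable names are illustrative. Because IQRs are not additive, nothing forces the percentages to sum to 100, and a single component can exceed 100% when it is partly offset by another component.

```r
fit <- stl(nottem, s.window = "periodic")
comp <- fit$time.series

# IQR of each component, plus the IQR of the data (the sum of the components).
iqr <- c(apply(comp, 2, IQR), data = IQR(rowSums(comp)))

# Each IQR expressed as a percentage of the IQR of the data.
pct <- round(100 * iqr / iqr["data"], 1)
pct

# Compare with the IQR section printed by the summary method.
summary(fit)
```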

Hope this helps,

Ralph

Hi all,

Using STL, how can I estimate the expected value from time series data? Is it correct to calculate the mean of the fitted values? I need to obtain this value to set a threshold value for epidemic warning. The threshold value = expected value + 2SD.

Thanks,

Hi!! Is it possible to graph only the trend? I want to compare the trend in three time series, and I want to put the three trends together to see how they behave. I don't know if this is possible (this is my first analysis with R). Thank you!!!!

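You can extract the trend from the *time.series* component of the object returned by **stl**, using its *trend* column.

This can be done for each of the time series, and the results combined to produce the required display.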
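One way this could look is sketched below. Since the three series from the question are not available, the base R data sets *ldeaths*, *mdeaths* and *fdeaths* stand in for them; the helper name `extract.trend` and the other variable names are hypothetical.

```r
# Three monthly series from base R, used here only as stand-ins.
series <- list(total = ldeaths, male = mdeaths, female = fdeaths)

# Hypothetical helper: fit STL and return only the trend column.
extract.trend <- function(x) {
  stl(x, s.window = "periodic")$time.series[, "trend"]
}

trends <- lapply(series, extract.trend)

# Combine the three trends into one multivariate time series.
trend.ts <- do.call(cbind, trends)

# Plot all trends on a common axis, with a legend to tell them apart.
ts.plot(trend.ts, col = 1:3, lty = 1:3, ylab = "Trend")
legend("topright", legend = names(trends), col = 1:3, lty = 1:3)
```

If the series are on very different scales, plotting each trend in its own panel with `plot(trend.ts)` may be easier to read than a shared axis.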

How can I decompose seasonality and trend using daily data such as stock prices?

You will need to create a time series object and then decide what period you want to use. The function ts should be your starting point.
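A minimal sketch of that approach follows. The data are synthetic, generated only for illustration, and the period of 5 (one trading week) is an assumption; the appropriate period depends on the data and on the question being asked.

```r
set.seed(42)

# Synthetic daily series (not real prices): 52 weeks of 5 trading days,
# with a linear drift, a weekly pattern and random noise.
n.days <- 5 * 52
day <- seq_len(n.days)
price <- 100 + 0.05 * day + 2 * sin(2 * pi * day / 5) + rnorm(n.days)

# Create the time series object; frequency = 5 is the assumed period.
price.ts <- ts(price, frequency = 5)

# Decompose and plot as before.
price.stl <- stl(price.ts, s.window = "periodic")
plot(price.stl)
```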