Seasonal Trend Decomposition in R

January 11th, 2013

The Seasonal Trend Decomposition using Loess (STL) is an algorithm that was developed to help to divide up a time series into three components namely: the trend, seasonality and remainder. The methodology was presented by Robert Cleveland, William Cleveland, Jean McRae and Irma Terpenning in the Journal of Official Statistics in 1990. The STL is available within R via the stl function.

The use of the stl function can be demonstrated using one of the data sets available within the base R installation. The well used nottem data set (Average Monthly Temperatures at Nottingham, 1920-1939) is a good starting point. The data itself is presented here:

> nottem
      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
1920 40.6 40.8 44.4 46.7 54.1 58.5 57.7 56.4 54.3 50.5 42.9 39.8
1921 44.2 39.8 45.1 47.0 54.1 58.7 66.3 59.9 57.0 54.2 39.7 42.8
1922 37.5 38.7 39.5 42.1 55.7 57.8 56.8 54.3 54.3 47.1 41.8 41.7
1923 41.8 40.1 42.9 45.8 49.2 52.7 64.2 59.6 54.4 49.2 36.3 37.6
1924 39.3 37.5 38.3 45.5 53.2 57.7 60.8 58.2 56.4 49.8 44.4 43.6
1925 40.0 40.5 40.8 45.1 53.8 59.4 63.5 61.0 53.0 50.0 38.1 36.3
1926 39.2 43.4 43.4 48.9 50.6 56.8 62.5 62.0 57.5 46.7 41.6 39.8
1927 39.4 38.5 45.3 47.1 51.7 55.0 60.4 60.5 54.7 50.3 42.3 35.2
1928 40.8 41.1 42.8 47.3 50.9 56.4 62.2 60.5 55.4 50.2 43.0 37.3
1929 34.8 31.3 41.0 43.9 53.1 56.9 62.5 60.3 59.8 49.2 42.9 41.9
1930 41.6 37.1 41.2 46.9 51.2 60.4 60.1 61.6 57.0 50.9 43.0 38.8
1931 37.1 38.4 38.4 46.5 53.5 58.4 60.6 58.2 53.8 46.6 45.5 40.6
1932 42.4 38.4 40.3 44.6 50.9 57.0 62.1 63.5 56.3 47.3 43.6 41.8
1933 36.2 39.3 44.5 48.7 54.2 60.8 65.5 64.9 60.1 50.2 42.1 35.8
1934 39.4 38.2 40.4 46.9 53.4 59.6 66.5 60.4 59.2 51.2 42.8 45.8
1935 40.0 42.6 43.5 47.1 50.0 60.5 64.6 64.0 56.8 48.6 44.2 36.4
1936 37.3 35.0 44.0 43.9 52.7 58.6 60.0 61.1 58.1 49.6 41.6 41.3
1937 40.8 41.0 38.4 47.4 54.1 58.6 61.4 61.8 56.3 50.9 41.4 37.1
1938 42.1 41.2 47.3 46.6 52.4 59.0 59.6 60.4 57.0 50.7 47.8 39.2
1939 39.4 40.9 42.4 47.8 52.4 58.0 60.7 61.8 58.2 46.7 46.6 37.8

We can try and run stl by specifying the data frame only but R returns an error message:

> stl(nottem)
Error in stl(nottem) : argument "s.window" is missing, with no default

Looking at the help pages we see the following information for the s.window argument: either the character string “periodic” or the span (in lags) of the loess window for seasonal extraction, which should be odd. so if we work with the periodic option we now find that R runs happily:

> nottem.stl = stl(nottem, s.window="periodic")

Now that we have the STL decomposition there is a plot function provided for the object created from a call to stl.

> plot(nottem.stl)

The graph looks like this:

STL Decomposition of Nottingham Temperature Time Series

STL Decomposition of Nottingham Temperature Time Series

The four graphs are the original data, seasonal component, trend component and the remainder and this shows the periodic seasonal pattern extracted out from the original data and the trend that moves around between 47 and 51 degrees Fahrenheit. There is a bar at the right hand side of each graph to allow a relative comparison of the magnitudes of each component. For this data the change in trend is less than the variation doing to the monthly variation.

9 Responses to “Seasonal Trend Decomposition in R”

  1. Erik Longa says:

    I have analysed a long time series of mean vegetation response values over various areas. (25 years twice monthly, periodicity freq=24) mainly to discover trend. All plots show very regular annual periodicity; summer response low, winter response high.
    I have similarly created stl plots for rainfall (25 years, monthly rain, freq=12). Which showed little seasonal as expected; rain is considered aseasonal in this region.

    I found the command: summary(stl)
    of the plots very useful as this shows the % of the seasonal, which all made sense until one of my regions showed seasonal greater than 100%.

    I have been using lines such as:
    “The analysis revealed that 76% of the data are expressed in the seasonal component”
    “The seasonal component represented 91% of the data.”
    “The stl plot showed strong seasonality at 92%”
    “STL analysis of local rainfall indicated that only 6% of rainfall variability over the 25 years was attributable to seasonality, which confirms that rainfall in this desert region is aseasonal”. Am I using this information correctly?
    Are the above statements valid, or should they be worded differently?
    The seasonal percentage in the summary is the percentage of what? I mean how can I best explain >100%?

    Sorry bothering you with this. I have been looking all over the web and in the help menus for an answer, but have been unable to find a reference to the meaning of the % seasonal in the summary(stl) output.
    Thanks, Erik

  2. Ralph says:

    Erik,

    When you look at the code for summary.stl it appears that the % values are related to inter-quartile range calculations (IQR) for the trend, seasonality, remainder and original data. The IQR for each of the three components is reported as a % of the IQR for the data so it is not a sum of squares decomposition as I think is how you would like to interpret the value?

    These values correspond to the bars on the right hand side of the four plots which show the relative contribution of the the components.

    Hope this helps,
    Ralph

    summary.stl <- function(object, digits = getOption("digits"), ...)
    {
        cat(" Call:\n ")
        dput(object$call, control=NULL)
        cat("\n Time.series components:\n")
        print(summary(object$time.series, digits = digits, ...))
        cat(" IQR:\n")
        iqr <- apply(cbind(STL = object$time.series,
                           data = object$time.series %*% rep(1,3)),
    		 2L, IQR)
        print(rbind(format(iqr, digits = max(2, digits - 3)),
    		"   %"= format(round(100 * iqr / iqr["data"], 1))),
    	  quote = FALSE)
        cat("\n Weights:")
        if(all(object$weights == 1)) cat(" all == 1\n")
        else {
            cat("\n"); print(summary(object$weights, digits = digits, ...))
        }
        cat("\n Other components: ")
        str(object[-(1L:3)], give.attr = FALSE)
        invisible(object)
    }
  3. Christian says:

    Hi all,

    Using STL, how can I estimate the expected value from a time-series data? Is it correct to calculate mean of the fitted values? I need to obtain this value to set a a threshold value for epidemic warning. The threshold value = expected value + 2SD.

    Thanks,

  4. Lucia says:

    Hi!! it is possible to graph only the trend? because i wanna compare the trend in three time series and i wanna put the three tendencies together to see how it´s behaviour. I don´t know if this is possible (This is my first analysis with R) Thank you!!!!

  5. Ralph says:

    You can extract the trend using:

    stl(nottem, s.window="periodic")$time.series[,"trend"]

    This can be done for each of the time series and combined to produce the require display.

  6. Uyen says:

    How can I decomposing seasonal and trend by using daily data like stock price?

  7. Ralph says:

    You will need to create a time series object and then decide what period would want to use. The function ts should be your starting point.

    > class(nottem)
    [1] "ts"
  8. Antonello Pareto says:

    Sorry to bother you, but I have a problem with stl.
    I’m trying to use it on a time series including monthly deaths by violence (absolute number). The object is a ts (I’ve tested with is.ts()). The object (call it ‘deaths’) is something like this:
    Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
    1969 177 160 177 173 209 201 188 153 147 157 131 140
    1970 177 161 169 193 207 189 228 180 182 152 128 153

    When I use stl, I receive this message ” ammesse solo serie unidimensionali” (it is in Italian language, it souds like that: error, only unidimensional series are admitted)
    I cannot figure out what is the problem. The series does not differ from nottem, except it is without decimals.
    Thank you in advance,
    Antonello

  9. Antonello Pareto says:

    I’ve forgot to mention that the series ‘deaths’ include 35 years

Leave a Reply