Design of Experiments – Software for Exploratory Data Analysis and Statistical Modelling

Design of Experiments: General Block Design

Ralph — Sat, 12 Oct 2013 19:49:24 +0000

In some experiments where the aim is to compare a set of treatments, there are one or two sources of variation that can be accounted for at the design stage of a study. The statistical technique used in these situations is blocking, which can reduce the variance of pairwise treatment comparisons.

When designing an experiment with a single blocking factor, a randomised block design (RBD) can be used if there are sufficient resources to investigate all treatments within each of the blocks of the starting design. When this is not possible an incomplete block design is required, and various incomplete block designs are available for specific combinations of design parameters.

The general block design investigates a set of v treatments allocated to n experimental units across b blocks. Each block contains k units, although the blocks do not all have to be the same size. Equal block sizes are nevertheless desirable because they contribute towards the balance of the design, which helps to ensure that all treatment comparisons are made with the same or similar precision.
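As a minimal sketch of the complete-block case, the following base R code randomises a set of treatments within each block. The treatment names, the number of blocks and the seed are assumptions chosen purely for illustration.

```r
# Hypothetical example: v = 4 treatments, b = 3 blocks,
# one unit per treatment in every block.
set.seed(1)  # fixed seed so the allocation is reproducible
treatments <- c("T1", "T2", "T3", "T4")
b <- 3

# Randomise the order of the treatments independently within each block.
rbd <- do.call(rbind, lapply(seq_len(b), function(j) {
  data.frame(Block = j,
             Unit = seq_along(treatments),
             Treatment = sample(treatments))
}))

# Every treatment should appear exactly once in every block.
table(rbd$Block, rbd$Treatment)
```

The table at the end is a quick check that the allocation really is a complete block design.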

Design of Experiments: Blocking, Confounding and Interactions

Ralph — Fri, 27 Sep 2013 11:12:50 +0000

In a previous post we considered some general points about experimental design. In this post we will look at some other common considerations when planning an experiment, specifically blocking, confounding and interactions.

Blocking: The idea behind blocking is to reduce the impact of uncontrolled variations on the experimental units. There are various examples of blocks including experiments on different machines, different operators or multiple days of the week. We would want to allocate our treatments across the levels of these nuisance factors; otherwise it may not be possible to compare the treatments directly, because treatment differences cannot be separated from the effect of the nuisance factors.

Confounding: The term confounding is related to blocking as it describes the situation where the effects of two factors cannot be separated from each other. In the design this can be seen by them always varying together. In the simple case of a two-level factorial experiment where each factor can be set at a low or high value, if the factors appear together only at low/low or high/high then they are confounded, as we cannot separate out which factor is causing any change.
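The low/low and high/high situation described above can be illustrated with a small base R sketch. The two data frames are invented for illustration, and the rank of the model matrix shows how many effects can be estimated separately.

```r
# Two-level factors coded -1 (low) and +1 (high).
# In the first design B always moves together with A.
confounded <- data.frame(A = c(-1, -1, 1, 1), B = c(-1, -1, 1, 1))
crossed    <- expand.grid(A = c(-1, 1), B = c(-1, 1))

# Rank of the model matrix for intercept + A + B.
rank_confounded <- qr(model.matrix(~ A + B, confounded))$rank
rank_crossed    <- qr(model.matrix(~ A + B, crossed))$rank

rank_confounded  # 2: A and B cannot be estimated separately
rank_crossed     # 3: both main effects are estimable
```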

Interactions: The term interaction refers to the joint effect of two (or more) factors on the output of a system. Here we cannot consider the main effects of the factors separately as the main effects and interaction need to be considered as a whole to describe the relationship between input and outputs.
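A small numerical sketch may help here. The cell means below are invented for illustration; the point is that the effect of one factor changes with the level of the other.

```r
# Hypothetical cell means for a 2 x 2 experiment (values invented).
means <- matrix(c(10, 12,
                  11, 19),
                nrow = 2, byrow = TRUE,
                dimnames = list(A = c("low", "high"), B = c("low", "high")))

# Effect of changing B from low to high at each level of A.
effect_B_at_A_low  <- means["low", "high"]  - means["low", "low"]
effect_B_at_A_high <- means["high", "high"] - means["high", "low"]

# A non-zero difference between these two effects indicates an interaction.
interaction_contrast <- effect_B_at_A_high - effect_B_at_A_low
interaction_contrast
```

With these made-up values the effect of B is larger when A is high, so describing B by a single main effect would be misleading.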

These issues are reasonably straightforward to visualise for small designs but it rapidly becomes more complex as the number of factors increases.

Design of Experiments: General Background

Ralph — Sun, 02 Jun 2013 19:35:54 +0000

The statistical methodology of design of experiments has a long history dating back to the work of Fisher, Yates and other researchers. One of the main motivating factors is to make good use of available resources and to avoid making decisions that cannot be corrected during the analysis stage of an investigation.

The statistical methodology is based on a systematic approach to investigating the causes of variation in a system of interest: the experimenter controls the factors that can be controlled, while taking some account of nuisance factors that can be measured but not controlled. As with most things there are some general principles and common considerations for experiments run in a variety of different areas.

The following considerations are required for running an experiment:

Absence of Systematic Error: when running an experiment the aim is to obtain a correct estimate of the metric of interest, e.g. treatment effect or difference. The design selected should avoid the introduction of bias into the subsequent analysis.
Adequate Precision: an experimental design is chosen to allow estimation and comparison of effects of interest, e.g. differences between treatments, so there should be sufficient replication in the experiment to allow these effects to be precisely estimated and also for meaningful differences to be detected.
Range of Validity: the range of variables considered in the experiment should cover the range of interest so that the results can be generalised without the need to rely on extrapolation.
Simplicity: ideally the choice of design should be simple to implement to ensure that it can be run as intended and to reduce the chance of missing data which could impact on the analysis of the results.
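The adequate precision consideration above is often addressed with a sample size calculation. As a minimal sketch for comparing two treatments, the power.t.test function in base R can be used; the planning values (difference to detect, standard deviation, significance level and power) are assumptions chosen for illustration.

```r
# Hypothetical planning values: detect a difference of 1 unit when the
# standard deviation within each group is also 1 unit.
plan <- power.t.test(delta = 1, sd = 1, sig.level = 0.05, power = 0.8)

# Replicates needed per treatment group, rounded up to a whole unit.
n_per_group <- ceiling(plan$n)
n_per_group
```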

There are various objectives when using design of experiments methodology and these include (a) screening of a large number of factors to reduce this set to a more manageable subset that can be investigated in greater detail, (b) response surface methodology to understand the behaviour of a system and (c) optimisation of a process or reduction of uncontrollable variation (or noise) in a process.

There are not a large number of packages relating to design of experiments in R but those of interest are covered by the Task View on CRAN.

Fractional Factorial Designs using FrF2

Ralph — Wed, 18 May 2011 18:17:18 +0000

The FrF2 package for R can be used to create regular and non-regular Fractional Factorial 2-level designs. It is reasonably straightforward to use.

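The first step is to install the package (only needed once) and then make it available for use in the current session:

# install.packages("FrF2")  # uncomment if the package is not yet installed
require(FrF2)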

A basic call to the main function FrF2 specifies the number of runs in the fractional factorial design (which needs to be a power of 2) and the number of factors. For example a three factor design would have a total of eight runs if it was a full factorial, but if we wanted to go with four runs then we can generate the design like this:

> FrF2(4, 3)
   A  B  C
1  1 -1 -1
2 -1  1 -1
3 -1 -1  1
4  1  1  1
class=design, type= FrF2

The default output labels the factors A, B, C and so on and the factor levels are -1 and +1 for the two levels of each factor. We can change the level names to low and high using the default.levels function argument:

> FrF2(4, 3, default.levels = c("low", "high"))
     A    B    C
1 high high high
2  low high  low
3 high  low  low
4  low  low high
class=design, type= FrF2

The factors can be specified as a vector of names rather than the number of factors via the factor.names argument:

> FrF2(4, factor.names = c("One", "Two", "Three"),
  default.levels = c("low", "high"))
   One  Two Three
1  low high   low
2 high high  high
3  low  low  high
4 high  low   low
class=design, type= FrF2

These are the basics and there are other features for greater control over the confounding between factors and their interactions that is introduced by using a fractional factorial design.
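The confounding introduced by a fraction can be seen without any package. The sketch below builds a four-run, three-factor design in base R by setting C equal to the product of A and B; this generator is an assumption for illustration, although it is consistent with the first design printed above.

```r
# Full factorial in A and B, coded -1/+1, then define C as the A x B product.
half <- expand.grid(A = c(-1, 1), B = c(-1, 1))
half$C <- half$A * half$B

# Compare the main-effect column for C with the A:B interaction column.
X <- model.matrix(~ A + B + C + A:B, data = half)
aliased <- all(X[, "C"] == X[, "A:B"])
aliased  # TRUE: C cannot be separated from the A:B interaction
```

In a four-run design of this kind each main effect is aliased with the two-factor interaction of the other two factors, which is the price paid for halving the number of runs.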

Generating Balanced Incomplete Block Designs (BIBD)

Ralph — Fri, 16 Jul 2010 12:04:50 +0000

The Balanced Incomplete Block Design (BIBD) is a well studied experimental design that has various desirable features from a statistical perspective. The crossdes package in R provides a way to generate a block design for some given parameters and test whether this design satisfies the BIBD conditions.

For a BIBD there are v treatments repeated r times in b blocks of k observations. There is a fifth parameter lambda that records the number of blocks in which each pair of treatments occurs together.

We first load the crossdes package in our session:

require(crossdes)

The function find.BIB is used to generate a block design with specific number of treatments, blocks (rows of the design) and elements per block (columns of the design).

Consider an example with five treatments in four blocks of three elements. We can create a block design via:

> find.BIB(5, 4, 3)
     [,1] [,2] [,3]
[1,]    1    3    4
[2,]    2    4    5
[3,]    2    3    5
[4,]    1    2    5

This design is not a BIBD because the treatments are not all repeated the same number of times in the design and we can check this with the isGYD function. For this example:

> isGYD(find.BIB(5, 4, 3))

[1] The design is neither balanced w.r.t. rows nor w.r.t. columns.

This confirms what we can see from the design.

Let us instead consider a design with seven treatments and seven blocks of three elements to see whether we can create a BIBD with these parameters:

> my.design = find.BIB(7, 7, 3)
> my.design
     [,1] [,2] [,3]
[1,]    1    2    5
[2,]    3    4    5
[3,]    1    3    6
[4,]    2    3    7
[5,]    2    4    6
[6,]    1    4    7
[7,]    5    6    7
> isGYD(my.design)

[1] The design is a balanced incomplete block design w.r.t. rows.

In this situation we are able to generate a valid BIBD experiment with the specified parameters.
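The BIBD conditions can also be checked directly in base R. The sketch below copies the seven-block design printed above into a matrix and counts how often each treatment, and each pair of treatments, occurs; the variable names are illustrative only.

```r
# The seven-treatment design printed above, one block per row.
design <- matrix(c(1, 2, 5,
                   3, 4, 5,
                   1, 3, 6,
                   2, 3, 7,
                   2, 4, 6,
                   1, 4, 7,
                   5, 6, 7), ncol = 3, byrow = TRUE)

# Treatment-by-block incidence matrix: entry (i, j) is 1 if
# treatment i appears in block j and 0 otherwise.
v <- max(design)
incidence <- sapply(seq_len(nrow(design)),
                    function(j) as.integer(seq_len(v) %in% design[j, ]))

# Diagonal = replications of each treatment;
# off-diagonal = number of blocks shared by each pair of treatments.
concurrence <- incidence %*% t(incidence)
r_values      <- unique(diag(concurrence))
lambda_values <- unique(concurrence[upper.tri(concurrence)])
r_values       # 3
lambda_values  # 1
```

A single value for the replication and a single value for the pairwise count is what makes the design balanced.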

Design of Experiments – Block Designs

Ralph — Sat, 20 Feb 2010 20:05:26 +0000

In many experiments where the investigator is comparing a set of treatments there is the possibility of one or more sources of variability in the experimental measurements that can be accounted for during the design stage of the experimentation. For example we might be investigating four different pieces of machinery using say two different operators, who would be expected to display different degrees of competence with the equipment. Or we might not be able to run all of the experimental combinations in one session so we would want to take into account systematic differences that are due to experiments in the various sessions.

The least complicated scenario is where we would have a single (nuisance) factor that we want to control for in the experiment. The statistical model used to describe the data collected in such an experiment could be written in the form:

where there are v treatments in b blocks and the number of units in each block does not have to be the same and is denoted using the k subscript.
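One common way of writing such a model, assuming additive treatment and block effects, is shown below. The symbols are an assumption and may differ from the notation originally used.

```latex
y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij},
\qquad i = 1, \dots, v, \quad j = 1, \dots, b
```

Here y_{ij} is the response for treatment i in block j, \mu is the overall mean, \tau_i is the effect of treatment i, \beta_j is the effect of block j and \varepsilon_{ij} is a random error term with mean zero. If block j contains k_j units, only the treatment and block combinations actually run in the experiment contribute observations.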

In a complete block design all treatments occur the same number of times in every block, usually one replicate of all treatments per block. There will be situations where the number of treatments is too large for all of them to be included in every block of the design. In these situations an incomplete block design would be used for running an experiment.

A special type of design is the balanced incomplete block design (BIBD), where the v treatments are investigated by allocating them to b blocks of equal size k. Here k is less than v, and b and k are chosen so that b * k is a multiple of v. All of the treatments occur exactly r times in the design and every pair of treatments occurs together in lambda of the b blocks.
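These parameters are tied together by two counting identities, r = b * k / v and lambda = r * (k - 1) / (v - 1), and both values must be whole numbers. The sketch below checks these necessary (but not sufficient) conditions; the function name bibd_parameters is hypothetical.

```r
# Necessary (not sufficient) conditions for a BIBD with v treatments,
# b blocks and k units per block. The function name is hypothetical.
bibd_parameters <- function(v, b, k) {
  r      <- b * k / v               # replications of each treatment
  lambda <- r * (k - 1) / (v - 1)   # blocks in which each pair occurs together
  is_whole <- function(x) isTRUE(all.equal(x, round(x)))
  list(r = r, lambda = lambda,
       feasible = k < v && is_whole(r) && is_whole(lambda))
}

bibd_parameters(7, 7, 3)$feasible  # TRUE  (r = 3, lambda = 1)
bibd_parameters(5, 4, 3)$feasible  # FALSE (r = 2.4 is not a whole number)
```

Passing this check does not guarantee that a BIBD exists for the given parameters, only that it has not been ruled out by simple counting.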

Two-way analysis of variance (ANOVA) is used to analyse data collected from an experiment using a block design, as discussed in the post on Two-way Analysis of Variance (ANOVA).

Two-way Analysis of Variance (ANOVA)

Ralph — Mon, 15 Feb 2010 21:45:02 +0000

The analysis of variance (ANOVA) model can be extended from making a comparison between multiple groups to take into account additional factors in an experiment. The simplest extension is from one-way to two-way ANOVA where a second factor is included in the model as well as a potential interaction between the two factors.

As an example consider a company that regularly has to ship parcels between its various (five for this example) sub-offices and has the option of using three competing parcel delivery services, all of which charge roughly similar amounts for each delivery. To determine which service to use, the company decides to run an experiment shipping three packages with each service from its head office to each of the five sub-offices, giving 45 deliveries in total. The delivery time for each package is recorded and the data loaded into R:

delivery.df = data.frame(
  Service = c(rep("Carrier 1", 15), rep("Carrier 2", 15),
    rep("Carrier 3", 15)),
  Destination = c(rep(c("Office 1", "Office 2", "Office 3",
    "Office 4", "Office 5"), 9)),
  Time = c(15.23, 14.32, 14.77, 15.12, 14.05,
  15.48, 14.13, 14.46, 15.62, 14.23, 15.19, 14.67, 14.48, 15.34, 14.22,
  16.66, 16.27, 16.35, 16.93, 15.05, 16.98, 16.43, 15.95, 16.73, 15.62,
  16.53, 16.26, 15.69, 16.97, 15.37, 17.12, 16.65, 15.73, 17.77, 15.52,
  16.15, 16.86, 15.18, 17.96, 15.26, 16.36, 16.44, 14.82, 17.62, 15.04)
)

The data is then displayed using a dot plot for an initial visual investigation of any trends in delivery time between the three services and across the five sub-offices. The colour aesthetic is used to distinguish between the three services in the plot.

require(ggplot2)  # provides ggplot(), aes() and geom_point()
ggplot(delivery.df, aes(Time, Destination, colour = Service)) + geom_point()

This code produces the following graph:

Graph of the delivery time for different services and destinations

The graph shows a general pattern of service carrier 1 having shorter delivery times than the other two services. There is also an indication that the differences between the services vary across the five sub-offices, so we might expect the interaction term to be significant in the two-way ANOVA model. To fit the two-way ANOVA model we use this code:

delivery.mod1 = aov(Time ~ Destination*Service, data = delivery.df)

The * symbol instructs R to create a formula that includes main effects for both Destination and Service as well as the two-way interaction between these two factors. We save the fitted model to an object which we can summarise as follows to test for importance of the various model terms:

> summary(delivery.mod1)
                    Df  Sum Sq Mean Sq  F value    Pr(>F)    
Destination          4 17.5415  4.3854  61.1553 5.408e-14 ***
Service              2 23.1706 11.5853 161.5599 < 2.2e-16 ***
Destination:Service  8  4.1888  0.5236   7.3018 2.360e-05 ***
Residuals           30  2.1513  0.0717                       
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We have strong evidence here that there are differences between the three delivery services, between the five sub-office destinations and that there is an interaction between destination and service in line with what we saw in the original plot of the data. Now that we have fitted the model and identified the important factors we need to investigate the model diagnostics to ensure that the various assumptions are broadly valid.
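The way the * operator expands, and the degrees of freedom in the table above, can be checked with a few lines of base R. This is a small sketch that needs no data, because the formula names its variables explicitly.

```r
# Expand the formula without fitting a model.
model_terms <- attr(terms(Time ~ Destination * Service), "term.labels")
model_terms
# "Destination" "Service" "Destination:Service"

# Degrees of freedom implied by 5 destinations, 3 services and 45 observations.
df_destination <- 5 - 1
df_service     <- 3 - 1
df_interaction <- df_destination * df_service
df_residual    <- 45 - 5 * 3
c(df_destination, df_service, df_interaction, df_residual)
```

The four values agree with the Df column of the summary output (4, 2, 8 and 30).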

We can plot the model residuals against fitted values to look for obvious trends that are not consistent with the model assumptions about independence and common variance. The first step is to create a data frame with the fitted values and residuals from the above model:

delivery.res = delivery.df
delivery.res$M1.Fit = fitted(delivery.mod1)
delivery.res$M1.Resid = resid(delivery.mod1)

Then a scatter plot is used to display the fitted values and residuals where the colour aesthetic highlights which points correspond to the three competing delivery services:

ggplot(delivery.res, aes(M1.Fit, M1.Resid, colour = Service)) + geom_point() +
  xlab("Fitted Values") + ylab("Residuals")

The xlab() and ylab() functions are used to change the text on the axis labels. The residual diagnostic plot is:

Diagnostic Residual Plot for Delivery Time Model

There are no obvious patterns in this plot that suggest problems with the two-way ANOVA model that we fitted to the data.

As an alternative display we could separate the residuals into destination sub-offices, where the facet_wrap() function instructs ggplot to create a separate display (panel) for each of the destinations.

ggplot(delivery.res, aes(M1.Fit, M1.Resid, colour = Service)) +
  geom_point() + xlab("Fitted Values") + ylab("Residuals") +
  facet_wrap( ~ Destination)

This produces the following alternative residual plot:

Diagnostic Residual Plot for Delivery Time Model by Destination

There are no obvious problems in this diagnostic plot.

We could also consider dividing the data by delivery service to get a different view of the residuals:

ggplot(delivery.res, aes(M1.Fit, M1.Resid, colour = Destination)) +
  geom_point() + xlab("Fitted Values") + ylab("Residuals") +
  facet_wrap( ~ Service)

This creates the following graph:

Diagnostic Residual Plot for Delivery Time Model by Service

Again there is nothing substantial here to lead us to consider an alternative analysis.

Lastly we consider the normal probability plot of the model residuals, using the stat_qq() option:

ggplot(delivery.res, aes(sample = M1.Resid)) + stat_qq()

The quantile plot is:

Normal Probability Plot for Delivery Time Model

This plot is very close to the straight line we would expect to observe if the residuals were approximately normally distributed. To round off the analysis we look at the Tukey HSD multiple comparisons to confirm that the differences are between delivery service 1 and the other two competing services:

> TukeyHSD(delivery.mod1, which = "Service")
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Time ~ Destination * Service, data = delivery.df)

$Service
                        diff        lwr       upr     p adj
Carrier 2-Carrier 1 1.498667  1.2576092 1.7397241 0.0000000
Carrier 3-Carrier 1 1.544667  1.3036092 1.7857241 0.0000000
Carrier 3-Carrier 2 0.046000 -0.1950575 0.2870575 0.8856246

Even with the multiple comparison post-hoc adjustment there is very strong evidence for the differences that we have consistently observed throughout the analysis.

We can use ggplot to visualise the difference in mean delivery time for the services and the 95% confidence intervals on these differences. We create a data frame from the TukeyHSD output by extracting the component relating to the delivery service comparison and add the text labels by extracting the row names from the data frame.

delivery.hsd = data.frame(TukeyHSD(delivery.mod1, which = "Service")$Service)
delivery.hsd$Comparison = row.names(delivery.hsd)

We then use geom_pointrange() to draw the lower, middle and upper values for each of the three pairwise comparisons of interest.

ggplot(delivery.hsd, aes(Comparison, y = diff, ymin = lwr, ymax = upr)) +
  geom_pointrange() + ylab("Difference in Mean Delivery Time by Service") +
  coord_flip()

The coord_flip() function is used to make the confidence intervals horizontal rather than vertical on the graph. This can be confusing when creating the axis labels, as we specify each label where it would appear prior to the flip of coordinates. In the example above we add text to the y axis but this now appears on the x axis in the final graph:

Plot of Confidence Intervals for Mean Differences using Tukey HSD

One-way ANOVA (cont.)

Ralph — Fri, 12 Feb 2010 13:45:34 +0000

In a previous post we considered using R to fit one-way ANOVA models to data. In this post we consider a few additional ways that we can look at the analysis.

In the analysis we made use of the linear model function lm, but the analysis could equally be conducted using the aov function. The code used to fit the model is very similar:

> plant.mod2 = aov(weight ~ group, data = plant.df)
> summary(plant.mod2)
            Df  Sum Sq Mean Sq F value  Pr(>F)  
group        2  3.7663  1.8832  4.8461 0.01591 *
Residuals   27 10.4921  0.3886                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The output from using the summary function on the fitted model object shows the analysis of variance table, with the p-value showing evidence of differences between the three groups. In R we can investigate the particular groups where there are differences using Tukey’s multiple comparisons:

> TukeyHSD(plant.mod2)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = weight ~ group, data = plant.df)

$group
                          diff        lwr       upr     p adj
Treatment 1-Control     -0.371 -1.0622161 0.3202161 0.3908711
Treatment 2-Control      0.494 -0.1972161 1.1852161 0.1979960
Treatment 2-Treatment 1  0.865  0.1737839 1.5562161 0.0120064

The multiple comparison tests highlight that the difference is between treatments 1 and 2. These 95% confidence intervals for the differences shown above can be plotted:

plot(TukeyHSD(plant.mod2))

which gives a plot of the three confidence intervals, one for each pairwise comparison.

The post-hoc adjustments are recommended as we are testing after looking at the data rather than undertaking a pre-planned analysis.
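To see where the Tukey intervals come from, the sketch below reproduces the interval for the comparison of the two treatments by hand. It uses the built-in PlantGrowth data with its original group labels (ctrl, trt1, trt2) so that it runs on its own; the variable names are illustrative.

```r
# Refit the one-way model on the built-in data (original group labels).
fit <- aov(weight ~ group, data = PlantGrowth)

# Residual mean square, group sizes and group means.
mse   <- sum(resid(fit)^2) / df.residual(fit)
by_gp <- split(PlantGrowth$weight, PlantGrowth$group)
n     <- sapply(by_gp, length)
means <- sapply(by_gp, mean)

# Half-width of the 95% Tukey interval for a difference between two groups,
# based on the studentised range distribution.
half_width <- qtukey(0.95, nmeans = length(means), df = df.residual(fit)) /
  sqrt(2) * sqrt(mse * (1 / n[["trt1"]] + 1 / n[["trt2"]]))

# Interval for trt2 - trt1, to compare with the TukeyHSD() output above.
difference <- means[["trt2"]] - means[["trt1"]]
c(diff = difference, lwr = difference - half_width, upr = difference + half_width)
```

The result should agree with the Treatment 2-Treatment 1 row of the TukeyHSD output, which is the only interval that excludes zero.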

One-way Analysis of Variance (ANOVA)

Ralph — Wed, 03 Feb 2010 21:01:24 +0000

Analysis of Variance (ANOVA) is a commonly used statistical technique for investigating data by comparing the means of subsets of the data. The base case is the one-way ANOVA, which is an extension of the two-sample t-test for independent groups covering situations where there are more than two groups being compared.

In one-way ANOVA the data is sub-divided into groups based on a single classification factor and the standard terminology used to describe the set of factor levels is treatment even though this might not always have meaning for the particular application. There is variation in the measurements taken on the individual components of the data set and ANOVA investigates whether this variation can be explained by the grouping introduced by the classification factor.

As an example we consider one of the data sets available with R relating to an experiment into plant growth. The purpose of the experiment was to compare the yields on the plants for a control group and two treatments of interest. The response variable was a measurement taken on the dried weight of the plants.

The first step in the investigation is to take a copy of the data frame so that we can make some adjustments as necessary while leaving the original data alone. We use the factor function to re-define the labels of the group variables that will appear in the output and graphs:

plant.df = PlantGrowth
plant.df$group = factor(plant.df$group,
  labels = c("Control", "Treatment 1", "Treatment 2"))

The labels argument is a vector of names corresponding to the levels of the group factor variable.

A boxplot of the distributions of the dried weights for the three competing groups is created using the ggplot2 package:

require(ggplot2)

ggplot(plant.df, aes(x = group, y = weight)) +
  geom_boxplot(fill = "grey80", colour = "blue") +
  scale_x_discrete() + xlab("Treatment Group") +
  ylab("Dried weight of plants")

The geom_boxplot() option is used to specify background and outline colours for the boxes. The axis labels are created with the xlab() and ylab() options. The plot that is produced looks like this:

Initial inspection of the data suggests that there are differences in the dried weight for the two treatments but it is not so clear cut to determine whether the treatments are different to the control group. To investigate these differences we fit the one-way ANOVA model using the lm function and look at the parameter estimates and standard errors for the treatment effects. The function call is:

plant.mod1 = lm(weight ~ group, data = plant.df)

We save the model fitted to the data in an object so that we can undertake various actions to study the goodness of the fit to the data and other model assumptions. The standard summary of a lm object is used to produce the following output:

> summary(plant.mod1)

Call:
lm(formula = weight ~ group, data = plant.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.0710 -0.4180 -0.0060  0.2627  1.3690 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)        5.0320     0.1971  25.527   <2e-16 ***
groupTreatment 1  -0.3710     0.2788  -1.331   0.1944    
groupTreatment 2   0.4940     0.2788   1.772   0.0877 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.6234 on 27 degrees of freedom
Multiple R-squared: 0.2641,     Adjusted R-squared: 0.2096 
F-statistic: 4.846 on 2 and 27 DF,  p-value: 0.01591

The model output indicates only weak evidence of a difference in the average growth for the 2nd treatment compared to the control group (p = 0.088), and no clear evidence of a difference for the 1st treatment. An analysis of variance table for this model can be produced via the anova command:

> anova(plant.mod1)
Analysis of Variance Table

Response: weight
          Df  Sum Sq Mean Sq F value  Pr(>F)  
group      2  3.7663  1.8832  4.8461 0.01591 *
Residuals 27 10.4921  0.3886                  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This table confirms that there are differences between the groups which were highlighted in the model summary. The function confint is used to calculate confidence intervals on the treatment parameters, by default 95% confidence intervals:

> confint(plant.mod1)
                       2.5 %    97.5 %
(Intercept)       4.62752600 5.4364740
groupTreatment 1 -0.94301261 0.2010126
groupTreatment 2 -0.07801261 1.0660126

The model residuals can be plotted against the fitted values to investigate the model assumptions. First we create a data frame with the fitted values, residuals and treatment identifiers:

plant.mod = data.frame(Fitted = fitted(plant.mod1),
  Residuals = resid(plant.mod1), Treatment = plant.df$group)

and then produce the plot:

ggplot(plant.mod, aes(Fitted, Residuals, colour = Treatment)) + geom_point()

which produces this graph:

We can see that there is no major problem with the diagnostic plot but some evidence of different variabilities in the spread of the residuals for the three treatment groups.
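The visual impression of unequal spread can be backed up with a formal check. As one possible sketch, Bartlett's test from base R compares the group variances; the built-in PlantGrowth data are used directly so that the snippet runs on its own.

```r
# Bartlett's test of equal variances across the three groups.
variance_check <- bartlett.test(weight ~ group, data = PlantGrowth)
variance_check$p.value

# Group standard deviations for a direct comparison of spread.
group_sd <- tapply(PlantGrowth$weight, PlantGrowth$group, sd)
group_sd
```

Bartlett's test is sensitive to departures from normality, so it is best read alongside the residual plot rather than in place of it.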

Design of Experiments – Blocking and Full Factorial Experimental Design Plans

Ralph — Sun, 06 Dec 2009 15:37:35 +0000

When considering using a full factorial experimental design there may be constraints on the number of experiments that can be run during a particular session, or there may be other practical constraints that introduce systematic differences into an experiment that can be handled during the design and analysis of the data collected during the experiment.

Blocking is a technique used in design of experiments methodology to deal with the systematic differences to ensure that all the factors of interest and interactions between the factors can be assessed in the design. When blocking occurs one or more of the interactions is likely to be confounded with the block effects but a good choice of blocking should hopefully ensure that it is a higher order interaction that would be challenging to interpret or not be expected to be important that is confounded.

The conf.design package in R is described by its author as a small library containing a series of simple tools for constructing and manipulating confounded and fractional factorial designs. The function conf.design can be used to construct symmetric confounded factorial designs.

A very simple example would be a three factor experiment where each factor has low and high settings (levels). If we wanted to divide the experiment into two blocks of four experimental units then we could confound the block effect with the three way interaction between the factors. The following code would create the required design plan:

require(conf.design)  # load the package before calling conf.design()
conf.design(rbind(c(1,1,1)), p=2, treatment.names = c("F1","F2","F3"))

The first argument is a matrix, with a single row in this case as there are only two blocks, which specifies the levels of the factors for the effect to be confounded with the blocks. The output from this function call is:

  Blocks F1 F2 F3
1      0  0  0  0
2      0  1  1  0
3      0  1  0  1
4      0  0  1  1
5      1  1  0  0
6      1  0  1  0
7      1  0  0  1
8      1  1  1  1

This shows two blocks, labelled 0 and 1, and the settings of the experiments to run in each block. In the first block the four factor combinations would be:

F1 low, F2 low, and F3 low.
F1 high, F2 high, and F3 low.
F1 high, F2 low, and F3 high.
F1 low, F2 high, and F3 high.

The remaining four combinations are used in the second block of experiments.
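The same allocation can be reproduced in base R, which makes the confounding rule explicit. This is a sketch of the idea rather than a description of how conf.design works internally: the block label is taken to be the sum of the three factor levels modulo 2.

```r
# All eight combinations of three two-level factors, coded 0 (low) and 1 (high).
plan <- expand.grid(F1 = 0:1, F2 = 0:1, F3 = 0:1)

# Confound blocks with the F1:F2:F3 interaction: the block label is the
# sum of the factor levels modulo 2.
plan$Blocks <- (plan$F1 + plan$F2 + plan$F3) %% 2

# Order by block to mirror the layout of the output shown above.
plan <- plan[order(plan$Blocks), c("Blocks", "F1", "F2", "F3")]
plan
```

Block 0 contains the four runs with an even number of factors at the high level, which matches the four combinations listed above.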