Analysis of Variance and Covariance in R

C. Patrick Doncaster

The commands below apply to the freeware statistical environment called R (R Development Core Team 2010). Each set of commands can be copy-pasted directly into R. Example datasets can be copy-pasted into .txt files from Examples of Analysis of Variance and Covariance (Doncaster & Davey 2007). For a given design and dataset in the format of the linked example, the commands will work for any number of factor levels and observations per level.

Each numbered section below shows commands for a design to be analyzed either by the function 'aov' which assumes residuals have a normal distribution, or by the function 'glm' which takes any named error structure. An example of a dataset of proportions analysed with a binomial error structure is given for split-plot model 5.9.

Model descriptors in titles and comments use the notation of Doncaster & Davey (2007), with '|' signifying 'cross-factored with', and '(' signifying 'nested in'. Factor and variable names take uppercase A, B, C, S, P, Q, with a prime identifying a factor as random. The numbers of factor levels take lowercase a, b, c, s, p, q (with no replication of p and q in split-plots). The symbol 'ε' signifies full replication. The R commands for model formulae use notation 'A:B' to request analysis of the interaction of A with B; 'A*B' to request main effects and interaction (i.e., 'A + B + A:B' with 'a' levels of A and 'b' levels of B); 'A/B' to request the main effect A, and B nested in A (i.e., 'A + A:B' with 'a' levels of A and 'b' levels of B per level of A).

Doncaster, C. P. & Davey, A. J. H. (2007) Analysis of Variance and Covariance: How to Choose and Construct Models for the Life Sciences. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511611377. Example datasets: http://www.southampton.ac.uk/~cpd/anovas/datasets/

R Development Core Team (2010). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

- Useful commands in R

- R commands for analysis of ANOVA and ANCOVA datasets

1 One-factor designs

2 Nested designs

3 Fully replicated factorial designs

4 Randomized-block designs

5 Split-plot designs

6 Repeated-measures designs

- Analysis of datasets for figures in Doncaster & Davey (2007)

- Significance of F

- Power calculations for ANOVA designs

Useful commands in R

In the following command lines, built-in functions are highlighted in dark blue. Clicking on a function will link to explanatory help on its uses. These and other built-in functions and programming controls are summarised in the Short R Reference Card, and R Reference Card 2.0.

# this hash-tag symbol on a command line means that everything following it on the same line is comment and not instruction

Help

 help.start()                    # general introduction to the R language

 help(anova)                     # help on anova

 ?anova                          # help on anova

 ??anova                         # help on help files for anova

 ?read.table                     # help on importing data

 ?par                            # help on graphical parameters

Set the working directory

 setwd("C:/R/ANOVA datasets in R")

 getwd()                         # list the working directory

 dir()                           # list files in the working directory

Create vectors

 E <- pi*2/3                     # create a vector E which gets the single-element value of pi*2/3

 F <- c(1, -0.5, 5.2, 3)         # create a vector F of numeric elements

 G <- c("1", "one", ".")         # create a vector G of character elements

 H <- 1:5                        # create a vector H of integers 1 through 5

 J <- gl(4, 3)                   # create vector J of 3 replicates of each of factor levels "1" through "4"

 K <- c(rep(max(as.numeric(B)), length(B))) # create a vector K containing the maximum value in vector B, repeated as many times as the length of B

 YM <- as.vector(tapply(Y,list(A, B), mean)) # create a vector YM which gets the mean value of vector Y at each combination of levels of vectors A and B

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B), mean))) # create a vector AM which gets the levels of factor A that correspond to each element of YM

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B), mean))) # create a vector BM which gets the levels of factor B that correspond to each element of YM

 aovdatam <- data.frame(cbind(AM, BM, YM)) # bind together the three vectors AM, BM, YM, into a data frame object, here called 'aovdatam'

Read in data with column headers from a '.txt' text file in the working directory

Hold the data for an analysis of variance in a 'data frame', comprising equal-length columns (the 'vectors'). The response measurements go in one column, one measurement per row, and each explanatory factor or variable takes a column, showing its level for the corresponding response measurement. Use NA for any missing data. See model 1.1 and other models for examples of databases suitable for reading into a data frame.

 aovdata <- read.table("Model1_1.txt", header = T) # the named data frame (here called 'aovdata') gets (the two-character '<-' symbol) data read from the named text-file table, with column headers

 attach(aovdata)                 # make the data frame accessible by name within the R session

 aovdata                         # see the data frame of column vectors

Read in data from a '.csv' Excel file in the working directory

 aovdata <- read.table("Model1_1.csv", sep = ",", col.names = c("A", "Y"), colClasses = c("factor", "numeric"))

 attach(aovdata)

List objects

 ls()                            # list all objects

 search()                        # list all attached data frames and packages

 names(aovdata)                  # list the vector names in the data frame object

 aovdata                         # view the contents of the named data frame or vector object

 Y[as.numeric(A) > 1]            # list the values of vector Y for which vector A takes values above 1

 class(Y)                        # show the class of vector Y

 length(A)                       # show the length of the vector A

 levels(A)                       # list the levels of the factor A

 nlevels(A)                      # number of levels of factor A

 table(A) ; table(A, B)          # tabulate the number of data rows at each level of factor A, or at each combination of levels of factors A and B

Set the class of a vector

 Y <- as.numeric(Y) ; A <- as.factor(A) # convert vector Y to numeric and vector A to a factor

Transform vectors

 Y1 <- log(Y)                    # create a vector Y1 that is the natural log of the vector Y

 Y2 <- log10(Y)                  # create a vector Y2 that is the log to base 10 of the vector Y

 Y3 <- asin(sqrt(Y)))            # create a vector Y3 that is the arcsin-root transformation of the vector Y

Summary statistics on vectors

 summary(aovdata)                # mean, median, quartiles, min-max of all vectors in the data frame

 mean(Y) ; median(Y)             # arithmetic mean and median of the vector

 min(Y) ; max(Y)                 # minimum and maximum values in the vector

 var(Y)                          # sample variance for values in the vector

 tapply(Y, A, mean)              # mean of vector Y at each level of factor A

 tapply(Y, list(A, B), mean)     # mean of vector Y at each combination of levels of factors A and B

 tapply(Y, A, var)^0.5           # sample standard deviation of vector Y at each level of factor A

 tapply(Y, A, length)            # number of observations of Y at each level of A

 coef(aov(Y ~ X))                # intercept and slope of linear regression of vector Y on numeric vector X

Diagnostic checks on the distribution of the residuals

 plot(aov(Y ~ A))                # plot residuals to an ANOVA in four consecutive graphs: Residuals vs Fitted, Normal Q-Q, Scale-Location, Constant Leverage

 plot(fitted(aov(Y ~ A)), resid(aov(Y ~ A)), xlab = "Fitted values", ylab = "Residuals", main = "residuals vs Fitted") # plot a diagnostic check for heteroscedasticity in the residuals from the response Y to a factor or numeric A

 qqnorm(resid(aov(Y ~ A)), main = "Rankit plot of residuals") # plot normal scores to check for skewness (convex bow for right skew, concave bow for left skew), kurtosis ('S' shape for flattened, inverted 'S' for peaked) and outliers in the residuals from the response Y to a factor or numeric A

 hist(resid(aov(Y ~ A)))              # histogram of residuals

 bartlett.test(Y ~ A)                 # Bartlett test of homogeneity of variances amongst levels of factor A

 bartlett.test(Y ~ interaction(A, B)) # Bartlett test of homogeneity of variances amongst cross-factors A and B

 shapiro.test(resid(aov(Y ~ A)))      # Shapiro-Wilk normality test of residuals

Plot the response to one or more factors

 plot(A, Y) # box and whisker plot of the response Y to a factor A, showing for each level of A its median straddled by the box of 25-50% and 50-75% quartiles, which together make up the interquartile range, and whiskers of minimum and maximum values lying within 1.5 interquartile ranges below the bottom and above the top of the box, and locating any outliers beyond the whiskers

 plot(A:B, Y, las = 1, xlab = "cross-factors A:B", ylab = "Response Y") # box and whisker plot of the response Y to cross-factored A:B, with reading direction of y-axis numbers aligned  left to right, and specified axis labels

 abline(h = 0.0)                 # add a reference line to the plot at Y = 0.0

 interaction.plot(A, B, Y, las = 1, xlab = "Factor A", ylab = "Response Y", trace.label = "Factor B", xtick = TRUE) # interaction plot of cross-factored means

Panel plots of responses to factor A at each level of factor B

The suite of commands below will create a panel of b plots for the means of numeric variable Y at each level of factor A given factor B, from a data frame containing equal-length vectors Y, A with a levels, and B with b levels.

 library(lattice)

 xyplot(Y ~ A|B,

 panel = function(x, y)

   panel.xyplot(x, y)             # plot Y against A at each level of B

   panel.xyplot(x, y, type = "a") # join the means of consecutive levels of A at each level of B

})

Plot factor means with error bars

Click here for a user-defined function that can be saved for future use as a script (e.g., use the RStudio menus: File - New script, copy-pasting it in and saving it as, for example, "PlotMeans.R"). Having run the script, a plot is obtained on each call of the function plot_means(factor, response, xlabel, ylabel, bars) for specified vectors of a factor and response, x-axis label, y-axis label, and type of error bars: SD or SE or CI. For example, plot the means +/-1 standard error given by the data provided for factorial model 1.1 by attaching the data frame and then calling:

 source(file="http://www.southampton.ac.uk/~cpd/anovas/datasets/PlotMeans.R") # run the script, here stored in the script file PlotMeans.R on the web

 plot_means(A, Y, "Factor A", "Response Y", "SE") # plot means and s.e.

Means +/-95% confidence intervals given by the data provided for factorial model 3.1 are plotted by attaching the data frame and then calling:

 source(file="http://www.southampton.ac.uk/~cpd/anovas/datasets/PlotMeans.R")

 plot_means(A:B, Y, "Cross-factors A:B", "Response Y", "CI")

The suite of commands below will plot means and 95% confidence intervals of Y at each level of factor A

 XM <- tapply(as.numeric(A), A, mean)

 YM <- tapply(Y, A, mean)

 SE <- (tapply(Y, A, var)/tapply(Y, A, length))^0.5

 CI <- SE*qt(0.975, (tapply(Y, A, length) - 1))

 plot(XM, YM, type = "p", lwd = 6, xaxt = "n", xlim = c(min(XM), max(XM)), ylim = c(min(YM-CI), max(YM+CI)), las = 1, xlab = "A", ylab = "Y means and 95% C.I.") # plot means

 for (i in 1:length(XM)) {lines(c(XM[i], XM[i]), c(YM[i]-CI[i], YM[i]+CI[i]))} # add 95% confidence intervals

 axis(1, at = XM, labels = names(XM)) # plot x-axis labels at each level of A

Plot the response to a numeric variable

Use the plot commands one at a time, because a new plot will overwrite any previous plot.

 plot(X, Y, las = 1, xlab = "covariate X", ylab = "Response Y") # scatter plot of the response Y to a numeric covariate X

 abline(coef(aov(Y ~ X)))        # add the least-squares regression line to the scatter plot

 library(nlme) ; plot(groupedData(Y ~ X|A), outer = ~1) # scatter plot of Y against covariate X with coloured lines joining consecutive points within each level of A

Plot the response to numeric variable A at each level of factor B

The suite of commands below will plot a graph with linear regressions of numeric variable Y against numeric variable A for each level of factor B, from a data frame containing equal-length vectors Y, A, and B with b levels

 plot(A, Y, type = "n", las = 1, xlab = "A", ylab = "Response Y") # plot the graph axes

 SA <- split(A, B)               # create new vector SA that splits A by levels of B

 SY <- split(Y, B)               # create new vector SY that splits Y by levels of B

 for (i in 1:nlevels(B)) points(SA[[i]], SY[[i]], pch = i - 1)  # add symbols for data points at each level of B (squares, circles, etc)

 for (i in 1:nlevels(B)) abline(lm(SY[[i]] ~ SA[[i]]), lty = i) # add regression lines for each level of B (continuous line, broken line, etc)

Panel plots of responses to numeric variable A at each level of factor B

The suite of commands below will create a panel of plots for the linear regression of numeric variable Y against numeric variable A given factor B, from a data frame containing equal-length vectors Y, A, and B with b levels.

 library(lattice)

 xyplot(Y ~ A|B,

 panel = function(x, y)

   panel.xyplot(x, y)             # plot Y against A at each level of B

   panel.abline(coef(aov(y ~ x))) # add the linear regressions of Y against A at each level of B

})

Open a library package

 library()                       # list all available packages

 library(nlme)                   # open the package of linear and non-linear mixed effects models

 library(help = nlme)            # help on the 'nlme' package

Detach a data frame after completion of analysis

 detach(aovdata)                 # detach the named data frame

 detach()                        # detach the most recently attached data frame

Remove objects from the working environment

 rm(aovdata)                     # remove the named object (e.g. a data frame or a vector)

 rm(list = ls())                 # remove all objects

Clear the console

<ctrl>L

R commands for analysis of ANOVA and ANCOVA datasets

Click here for a zip file containing all of the datasets named below. Copy-paste your own data into a .txt file with the same structure of tab-delimited columns with headers.

1 One-factor designs

1.1 One-factor model Y = A + ε

The commands below use data file 'Model1_1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model1_1.txt", header = T)

 attach(aovdata)

 A <- factor(A)

Commands for factorial analysis

 model1_1i <- aov(Y ~ A)         # Create an object (here called 'model1_1i') that gets an analysis of variance with the model: numeric vector Y is explained by factor A

 summary(model1_1i)              # report the results of the analysis

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ A, family = gaussian(link = identity)), test = "F") # deviances equal SS

Commands for plotting Y at each level of factor A

 plot(A, Y, las = 1, xlab = "A", ylab = "Y")# box and whisker plot of the response Y to factor A

 round(tapply(Y, A, mean),2)     # report the mean of Y at each level of factor A, rounded to 2 decimal places

Commands for covariate analysis

 A <- as.numeric(A)

 model1_1i <- aov(Y ~ A)         # Create an object (here called 'model1_1i') that gets an analysis of variance with the model: numeric vector Y is explained by numeric vector A

 summary(model1_1i)              # report the results of the regression

Or equally for estimates of the linear regression coefficients

 A <- as.numeric(A)

 model1_1i <- lm(Y ~ A)

 summary(model1_1i)

Or by glm with the error distribution belonging to a named family

 A <- as.numeric(A)

 anova(glm(Y ~ A, family = gaussian(link = identity)), test = "F")

Commands for plotting the linear regression of Y against numeric A

 plot(A, Y, las = 1, xlab = "A", ylab = "Y")   # scatter plot of the response Y to numeric covariate A

 abline(coef(model1_1i))         # add regression line to scatter plot

 coef(model1_1i)                 # report the intercept and slope of the linear regression

1.1 model Y = A + ε with contrasts on 3-level A

The commands below use data file 'Model1_1contrasts3.txt' on the web for an example analysis.

Prepare the data frame

 aovcont <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model1_1contrasts3.txt", header = T)

 attach(aovcont)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 model1_1c <- aov(Y ~ B + C)

 summary(model1_1c)

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model1_1i <- aov(Y ~ A)

 summary.lm(model1_1i)

Or equally

 options(contrasts = c("contr.helmert", "contr.helmert"))

 model1_1i <- lm(Y ~ A)

 summary(model1_1i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ B + C, family = gaussian(link = identity)), test = "F")

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model1_1i <- glm(Y ~ A, family = gaussian(link = identity), contrasts = A)

 summary(model1_1i)

1.1 model Y = A + ε with contrasts on 5-level A

The commands below use data file 'Model1_1contrasts5.txt' on the web for an example analysis.

Prepare the data frame

 aovcont <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model1_1contrasts5.txt", header = T)

 attach(aovcont)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C) ; D <- factor(D) ; E <- factor(E)

Commands for factorial analysis

 model1_1c <- aov(Y ~ B + C + D + E)

 summary(model1_1c)

Or equally

 contrasts(A) <- cbind(c(4,-1,-1,-1,-1), c(0,1,1,-1,-1), c(0,1,-1,0,0), c(0,0,0,1,-1))

 model1_1i <- aov(Y ~ A)

 summary.lm(model1_1i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ B + C + D + E, family = gaussian(link = identity)), test = "F")

Or equally

 contrasts(A) <- cbind(c(4,-1,-1,-1,-1), c(0,1,1,-1,-1), c(0,1,-1,0,0), c(0,0,0,1,-1))

 model1_1i <- glm(Y ~ A, family = gaussian(link = identity), contrasts = A)

 summary(model1_1i)

2 Nested designs

2.1 Two-factor nested model Y = B'(A) + ε

The commands below use data file 'Model2_1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model2_1.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 A_effect <- aov(Y ~ A + Error(A:B))           # test A against B nested in A

 summary(A_effect)

 anova(aov(Y ~ A), aov(Y ~ A/B), test = "F")   # test A:B

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-B ANOVA for A tested against B

 YM <- as.vector(tapply(Y,list(A, B), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B), mean)))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Between-subjects ANOVA for B

 between_B <- glm(Y ~ A, family = gaussian(link = identity))

 between_S <- glm(Y ~ A/B, family = gaussian(link = identity))

 anova(between_B, between_S, test = "F")

Commands for analysis with covariate A

 # Step 1. Convert A to numeric, and B from b factor levels nested in each level of A into a*b factor levels

 A <- as.numeric(A)

 b <- max(as.numeric(B)) ; B1 <- as.factor(as.numeric(B) + b*(A - 1))

 # Step 2. Between-B1 ANOVA for covariate A tested against B1

 A_effect <- aov(Y ~ A + Error(A:B1))

 summary(A_effect)

 # Step 3. Between-subjects ANOVA for B1

 anova(aov(Y ~ A), aov(Y ~ A/B1), test = "F")

Or by glm for analysis with covariate A

 # Step 1. Convert A to numeric, and B from b levels nested in each level of A into a*b levels

 A <- as.numeric(A)

 b <- max(as.numeric(B)) ; B1 <- as.factor(as.numeric(B) + b*(A - 1))

 # Step 2. Between-B1 ANOVA for covariate A tested against B1

 YM <- as.vector(tapply(Y,list(A, B1), mean))

 AM <- as.vector(tapply(as.numeric(A),list(A, B1), mean))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 3. Between-subjects ANOVA for B1

 between_B <- glm(Y ~ A, family = gaussian(link = identity))

 between_S <- glm(Y ~ A/B1, family = gaussian(link = identity))

 anova(between_B, between_S, test = "F")

2.2 Three-factor nested model Y = C'(B'(A)) + ε

The commands below use data file 'Model2_2.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model2_2.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 A_effect <- aov(Y ~ A + Error(A:B))           # test A

 summary(A_effect)

 B_effect <- aov(Y ~ A/B + Error(A:B:C))       # test A:B

 summary(B_effect)                             # ignore the F-value for A

 anova(aov(Y ~ A/B), aov(Y ~ A/B/C), test = "F") # test A:B:C

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-B ANOVA for A tested against B

 YM <- as.vector(tapply(Y,list(A, B), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B), mean)))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Between-C ANOVA for B tested against C

 YM <- as.vector(tapply(Y,list(A, B, C), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, C), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, C), mean)))

 between_B <- glm(YM ~ AM, family = gaussian(link = identity))

 between_C <- glm(YM ~ AM/BM, family = gaussian(link = identity))

 anova(between_B, between_C, test = "F")

 # Step 3. Between-subjects ANOVA for C

 between_C <- glm(Y ~ A/B, family = gaussian(link = identity))

 between_S <- glm(Y ~ A/B/C, family = gaussian(link = identity))

 anova(between_C, between_S, test = "F")

3 Fully replicated factorial designs

3.1 Two-factor fully cross-factored model Y = B|A + ε

The commands below use data file 'Model3_1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_1.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model3_1i <- aov(Y ~ A*B)       # A*B requests A + B + A:B

 summary(model3_1i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ A*B, family = gaussian(link = identity)), test = "F")

Command for interaction plot of cross-factored means

 interaction.plot(A, B, Y, las = 1, xlab = "Factor A", ylab = "Response Y", trace.label = "Factor B", xtick = TRUE)

Commands for analysis with covariate A

 A <- as.numeric(A) ; B <- as.factor(B)

 model3_1i <- aov(Y ~ A*B)

 summary(model3_1i)

Commands for plotting the response to numeric A at each level of factor B

 plot(A, Y, type = "n", las = 1, xlab = "A", ylab = "Response Y") # plot the graph axes

 SA <- split(A, B)               # create new vector SA that splits A by levels of B

 SY <- split(Y, B)               # create new vector SY that splits Y by levels of B

 for (i in 1:nlevels(B)) points(SA[[i]], SY[[i]], pch = i - 1)  # add symbols for data points at each level of B (squares and circles)

 for (i in 1:nlevels(B)) abline(lm(SY[[i]] ~ SA[[i]]), lty = i) # add regression lines for each level of B (continuous and broken)

Commands for panel plots of numeric A at each level of factor B

 library(lattice)

 xyplot(Y ~ A|B,

 panel = function(x, y)

   panel.xyplot(x, y)             # plot Y against A at each level of B

   panel.abline(coef(aov(y ~ x))) # add the linear regressions of Y against A at each level of B

})

Commands for analysis with covariates A and B

 A <- as.numeric(A) ; B <- as.numeric(B)

 model3_1i <- aov(Y ~ A*B)

 summary(model3_1i)

3.1 Random-factor version of model Y = B'|A + ε

The unreplicated version of this design is analyzed by model 4.1.

The commands below use data file 'Model3_1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_1.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model3_1ii <- aov(Y ~ A + B + Error(A:B))     # unrestricted model for B

 summary(model3_1ii)

 anova(aov(Y ~ A), aov(Y ~ A + B), aov(Y ~ A*B), test = "F") # restricted alternative for B, and A:B under either model

Or by glm with the error distribution belonging to a named family

 # Step 1. Main effects A + B averaged across subjects and tested against A:B

 YM <- as.vector(tapply(Y,list(A, B), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B), mean)))

 anova(glm(YM ~ AM + BM, family = gaussian(link = identity)), test = "F")

 # Step 2. Interaction A:B

 main_effects <- glm(Y ~ A + B, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*B, family = gaussian(link = identity))

 anova(main_effects, full_model, test = "F")

3.1 model Y = B|A + ε with contrasts on 3-level A

The commands below use data file 'Model3_1contrasts3.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_1contrasts3.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; C <- factor(C) ; D <- factor(D) ; B <- factor(B)

Commands for factorial analysis

 model3_1c <- aov(Y ~ C/D + B*C/D)

 summary(model3_1c)

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model3_1i <- aov(Y ~ A*B)

 summary.lm(model3_1i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ C/D + B*C/D, family = gaussian(link = identity)), test = "F")

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model3_1i <- glm(Y ~ A*B, family = gaussian(link = identity), contrasts = A)

 summary(model3_1i)

3.1 model Y = B|A + ε with contrasts on 3-level A and on 4-level B

The commands below use data file 'Model3_1contrasts3x4.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_1contrasts3x4.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C) ; D <- factor(D) ; E <- factor(E) ; F <- factor(F) ; G <- factor(G)

Commands for factorial analysis

 model3_1c <- aov(Y ~ C + D + E + F + G + C*E + C*F + C*G + D*E + D*F + D*G)

 summary(model3_1c)

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 contrasts(B) <- cbind(c(1,1,-1,-1), c(1,-1,0,0), c(0,0,1,-1))

 model3_1i <- aov(Y ~ A*B)

 summary.lm(model3_1i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ C + D + E + F + G + C*E + C*F + C*G + D*E + D*F + D*G, family = gaussian(link = identity)), test = "F")

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 contrasts(B) <- cbind(c(1,1,-1,-1), c(1,-1,0,0), c(0,0,1,-1))

 model3_1i <- glm(Y ~ A*B, family = gaussian(link = identity), contrasts = A, B)

 summary(model3_1i)

3.2 Three-factor fully cross-factored model Y = C|B|A + ε

The commands below use data file 'Model3_2.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_2.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 model3_2i <- aov(Y ~ A*B*C)

 summary(model3_2i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ A*B*C, family = gaussian(link = identity)), test = "F")

3.2 Random-factor version of model Y = C|B'|A + ε

The unreplicated version of this design is analyzed by model 4.2.

The commands below use data file 'Model3_2.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_2.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 # Step 1. Main effects A + B (restricted) + C and interaction A:C

 model3_2ii <- aov(Y ~ A*B*C - A:B - C:B - A:B:C + Error(A:B + B:C + A:B:C))

 summary(model3_2ii)

 # Step 2. Add restricted-model analysis of interactions A:B, B:C, and A:B:C

 all_but_B_interactions <- aov(Y ~ A*B*C - A:B - B:C - A:B:C)

 add_AB <- aov(Y ~ A*B*C - B:C - A:B:C)

 add_BC <- aov(Y ~ A*B*C - A:B:C)

 full_model <- aov(Y ~ A*B*C)

 anova(all_but_B_interactions, add_AB, add_BC, full_model, test = "F")

Or by glm with the error distribution belonging to a named family

 # Step 1. Main effects A + B averaged across C and tested against A:B

 YM <- as.vector(tapply(Y,list(A, B), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B), mean)))

 anova(glm(YM ~ AM + BM, family = gaussian(link = identity)), test = "F")

 # Step 2. Add C averaged across A, and tested against B:C

 YM <- as.vector(tapply(Y,list(B, C), mean))

 BM <- factor(as.vector(tapply(as.numeric(B),list(B, C), mean)))

 CM <- factor(as.vector(tapply(as.numeric(C),list(B, C), mean)))

 add_B <- glm(YM ~ BM, family = gaussian(link = identity))

 add_C <- glm(YM ~ BM + CM, family = gaussian(link = identity))

 anova(add_B, add_C, test = "F")

 # Step 3. Add A:C averaged across subjects and tested against A:B:C

 YM <- as.vector(tapply(Y,list(A, B, C), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, C), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, C), mean)))

 CM <- factor(as.vector(tapply(as.numeric(C),list(A, B, C), mean)))

 all_but_AC <- glm(YM ~ AM*BM + BM*CM, family = gaussian(link = identity))

 full_model <- glm(YM ~ AM*BM*CM - AM:BM:CM, family = gaussian(link = identity))

 anova(all_but_AC, full_model, test = "F")

 # Step 4. Add restricted-model analysis of interactions A:B, B:C, and A:B:C

 all_but_B_interactions <- glm(Y ~ A*B*C - A:B - B:C - A:B:C, family = gaussian(link = identity))

 add_AB <- glm(Y ~ A*B*C - B:C - A:B:C, family = gaussian(link = identity))

 add_BC <- glm(Y ~ A*B*C - A:B:C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*B*C, family = gaussian(link = identity))

 anova(all_but_B_interactions, add_AB, add_BC, full_model, test = "F")

3.3 Three-factor replicated split-plot model Y = C|B'(A) + ε

The unreplicated version of this design is analyzed by model 5.6.

The commands below use data file 'Model3_3.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_3.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 model3_3i <- aov(Y ~ A*C + Error(A:B + A:B:C)) # test A*C

 summary(model3_3i)

 anova(aov(Y ~ A/B + A*C), aov(Y ~ A/B*C), test = "F")      # test C:A/B (and calculation of F-value for A:B by hand from residual error)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-B ANOVA for A averaged across C and tested against B

 YM <- as.vector(tapply(Y,list(A, B), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B), mean)))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Within-B ANOVA by model comparisons for sequentially added terms C, A:C, averaged across subjects and tested against A:B:C with (c-1)*(b-a)*a error d.f.

 YM <- as.vector(tapply(Y,list(A, B, C), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, C), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, C), mean)))

 CM <- factor(as.vector(tapply(as.numeric(C),list(A, B, C), mean)))

 between_B <- glm(YM ~ AM/BM, family = gaussian(link = identity))

 add_C <- glm(YM ~ AM/BM + CM, family = gaussian(link = identity))

 add_AC <- glm(YM ~ AM/BM + CM/AM, family = gaussian(link = identity))

 anova(between_B, add_C, add_AC, test = "F")

 # Step 3. Within-B ANOVA by model comparison for sequentially added term A:B:C (and calculation of F-value for A:B by hand from residual error)

 within_B <- glm(Y ~ A/B + C/A, family = gaussian(link = identity))

 full_model <- glm(Y ~ A/B + C/A + A:B:C, family = gaussian(link = identity))

 anova(within_B, full_model, test = "F")

Commands for analysis with covariate C

 A <- as.factor(A) ; B <- as.factor(B) ; C <- as.numeric(C)

 model3_3v <- aov(Y ~ A*C + Error(A:B + A:B:C))        # test A*C

 summary(model3_3v)

 anova(aov(Y ~ A/B + A*C), aov(Y ~ A/B*C), test = "F") # test C:A/B (and calculation of F-value for A:B by hand from residual error)

3.4 Nested cross-factored model Y = C'(B|A) + ε

The commands below use data file 'Model3_4.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model3_4.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 model3_4i <-aov(Y ~ A*B + Error(A:B:C))         # test A*B

 summary(model3_4i)

 anova(aov(Y ~ A*B), aov(Y ~ A*B*C), test = "F") # test C

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-C ANOVA for A|B averaged across subjects and tested against C

 YM <- as.vector(tapply(Y,list(A, B, C), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, C), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, C), mean)))

 anova(glm(YM ~ AM*BM, family = gaussian(link = identity)), test = "F")

 # Step 2. Between-subjects ANOVA for C

 within_C <- glm(Y ~ A*B, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*B + A:B:C, family = gaussian(link = identity))

 anova(within_C, full_model, test = "F")

4 Randomized-block designs

4.1/6.1/7.1 One-factor randomized complete block model Y = S'|A

The fully replicated version of this design is analyzed by the random-factor version of model 3.1.

The commands below use data file 'Model4_1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_1.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A)

Commands for factorial analysis

 model4_1i <- aov(Y ~ S + A)

 summary(model4_1i)              # Model-2 output for S

Or equally

 library(nlme)

 anova(lme(Y ~ A, random = ~ 1|S), test = "F")

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ S + A, family = gaussian(link = identity)), test = "F")

4.1 model Y = S'|A with contrasts on A

The commands below use data file 'Model4_1contrasts.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_1contrasts.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-1 factorial analysis

 model4_1c <- aov(Y ~ S + B/C + Error(S + S:B + S:B:C))

 summary(model4_1c)

Commands for Model-2 factorial analysis

 model4_1c <- aov(Y ~ S + B/C)

 summary(model4_1c)

Or equally for Model-2

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model4_1i <- aov(Y ~ S + A)

 summary.lm(model4_1i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ S + B/C, family = gaussian(link = identity)), test = "F")

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model4_1i <- glm(Y ~ S + A, family = gaussian(link = identity), contrasts = A)

 summary(model4_1i)

4.1 One-factor balanced incomplete blocks variant of model Y = S'|A

The commands below use data file 'Model4_1BIB.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_1BIB.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A)

Commands for factorial analysis

 library(nlme)

 anova(lme(Y ~ S + A, random = ~ 1|S), test = "F")

Or by glm with the error distribution belonging to a named family

 # Step 1. S adjusted for A

 A_only <- glm(Y ~ A, family = gaussian(link = identity))

 full_model <- glm(Y ~ A + S, family = gaussian(link = identity))

 anova(A_only, full_model, test = "F")

 # Step 2. A adjusted for S

 S_only <- glm(Y ~ S, family = gaussian(link = identity))

 anova(S_only, full_model, test = "F")

4.1 One-factor unreplicated Latin square variant of model Y = S'|A

The commands below use data file 'Model4_1LS.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_1LS.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 model4_1LS <- aov(Y ~ A + B + C)

 summary(model4_1LS)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ A + B + C, family = gaussian(link = identity)), test = "F")

4.1 One-factor unreplicated Youden square variant of model Y = S'|A

The commands below use data file 'Model4_1YS.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_1YS.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 anova(aov(Y ~ B + C), aov(Y ~ A + B + C))     # A, adjusted for B and C

 anova(aov(Y ~ A + C), aov(Y ~ A + B + C))     # B, adjusted for A and C

 anova(aov(Y ~ A + B), aov(Y ~ A + B + C))     # C, adjusted for A and B

Or by glm with the error distribution belonging to a named family

 # Step 1. A adjusted for B and C

 BC_only <- glm(Y ~ B + C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A + B + C, family = gaussian(link = identity))

 anova(BC_only, full_model, test = "F")

 # Step 2. B adjusted for A and C

 AC_only <- glm(Y ~ A + C, family = gaussian(link = identity))

 anova(AC_only, full_model, test = "F")

 # Step 3. C adjusted for A and B

 AB_only <- glm(Y ~ A + B, family = gaussian(link = identity))

 anova(AB_only, full_model, test = "F")

4.2/6.2/7.2 Two-factor randomized complete block model Y = S'|B|A

The fully replicated version of this design is analyzed by the random-factor version of model 3.2.

The commands below use data file 'Model4_2.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_2.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B)

Commands for Model-1 factorial analysis

 model4_2i <- aov(Y ~ S + A*B + Error(S + S:A + S:B))

 summary(model4_2i)

Commands for Model-2 factorial analysis

 model4_2i <- aov(Y ~ S + A*B)

 summary(model4_2i)

Or equally for Model-2

 library(nlme)

 anova(lme(Y ~ S + A*B, random = ~ 1|S), test = "F")

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ S + A*B, family = gaussian(link = identity)), test = "F")

4.2 model Y = S'|B|A with contrasts on A

The commands below use data file 'Model4_2contrasts.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_2contrasts.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; C <- factor(C) ; D <- factor(D); B <- factor(B)

Commands for Model-1 factorial analysis

 model4_2c <- aov(Y ~ S + C/D + B*C/D - S:B:C:D + Error(S + S:B + S:C + S:C:D + S:B:C + S:B:C/D))

 summary(model4_2c)

Commands for Model-2 factorial analysis

 model4_2i <- aov(Y ~ S + C/D + B*C/D)

 summary(model4_2i)

Or equally for Model-2

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model4_2i <- aov(Y ~ S + A*B)

 summary.lm(model4_2i)

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ S + C/D + B*C/D, family = gaussian(link = identity)), test = "F")

Or equally

 contrasts(A) <- cbind(c(2,-1,-1), c(0,1,-1))

 model4_2i <- glm(Y ~ S + A*B, family = gaussian(link = identity), contrasts = A)

 summary(model4_2i)

4.3 Three-factor randomized complete block model Y = S'|C|B|A

The commands below use data file 'Model4_3.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model4_3.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-1 factorial analysis

 model4_3i <- aov(Y ~ S + A*B*C + Error(S + S:A + S:B + S:C + S:A:B + S:A:C + S:B:C))

 summary(model4_3i)

Commands for Model-2 factorial analysis

 model4_3i <- aov(Y ~ S + A*B*C)

 summary(model4_3i)

Or equally for Model-2

 library(nlme)

 anova(lme(Y ~ S + A*B*C, random = ~ 1|S), test = "F")

Or by glm with the error distribution belonging to a named family

 anova(glm(Y ~ S + A*B*C, family = gaussian(link = identity)), test = "F")

5 Split-plot designs

5.1 Two-factor unreplicated split-plot model Y = B|P'(S'|A)

The commands below use data file 'Model5_1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_1.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B)

Commands for Model-2 factorial analysis

 model5_1i <- aov(Y ~ A*B + Error(S/A))

 summary(model5_1i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-P ANOVA for A averaged across B and tested against S

 YM <- as.vector(tapply(Y,list(A, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, S), mean)))

 SM <- factor(as.vector(tapply(as.numeric(S),list(A, S), mean)))

 anova(glm(YM ~ SM  + AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Within-P ANOVA by model comparisons for sequentially added terms B, A:B, both tested against pooled interactions with S with a*(b-1)*(s-1) error d.f.

 between_P <- glm(Y ~ A*S, family = gaussian(link = identity))

 add_B <- glm(Y ~ A*S + B, family = gaussian(link = identity))

 add_AB <- glm(Y ~ A*S + B/A, family = gaussian(link = identity))

 anova(between_P, add_B, add_AB, test = "F")

5.2 Three-factor unreplicated split-plot model Y = C|P'(S'|B|A)

The commands below use data file 'Model5_2.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_2.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-2 factorial analysis

 model5_2i <- aov(Y ~ A*B*C + Error(S/A:B))

 summary(model5_2i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-P ANOVA for A averaged across C and tested against S

 YM <- as.vector(tapply(Y,list(A, B, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, S), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, S), mean)))

 SM  <- factor(as.vector(tapply(as.numeric(S),list(A, B, S), mean)))

 anova(glm(YM ~ SM + AM*BM, family = gaussian(link = identity)), test = "F")

 # Step 2. Within-P ANOVA by model comparisons for sequentially added terms C, A:C, B:C, A:B:C, all tested against pooled interactions with S with (a*b-1)*(c-1)*(s-1) error d.f.

 between_P <- glm(Y ~ A*B*S, family = gaussian(link = identity))

 add_C <- glm(Y ~ A*B*S + C, family = gaussian(link = identity))

 add_AC <- glm(Y ~ A*B*S + C/A, family = gaussian(link = identity))

 add_BC <- glm(Y ~ A*B*S + C/A + B:C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*B*S + A*B*C, family = gaussian(link = identity))

 anova(between_P, add_C, add_AC, add_BC, full_model, test = "F")

5.3 Three-factor unreplicated split-plot model Y = C|B|P'(S'|A)

The commands below use data file 'Model5_3.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_3.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-2 factorial analysis

 model5_3i <- aov(Y ~ A*B*C + Error(S/A))

 summary(model5_3i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-P ANOVA for A averaged across B & C and tested against S

 YM <- as.vector(tapply(Y,list(A, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, S), mean)))

 SM <- factor(as.vector(tapply(as.numeric(S),list(A, S), mean)))

 anova(glm(YM ~ SM + AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Within-P ANOVA by model comparisons for sequentially added terms B, A:B, C, A:C, B:C, A:B:C, all tested against pooled interactions with S with a*(b*c-1)*(s-1) error d.f.

 between_P <- glm(Y ~ A*S, family = gaussian(link = identity))

 add_B <- glm(Y ~ A*S + B, family = gaussian(link = identity))

 add_AB <- glm(Y ~ A*S + B/A, family = gaussian(link = identity))

 add_C <- glm(Y ~ A*S + B/A + C, family = gaussian(link = identity))

 add_AC <- glm(Y ~ A*S + B/A + C/A, family = gaussian(link = identity))

 add_BC <- glm(Y ~ A*S + A*B*C - A:B:C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*S + A*B*C, family = gaussian(link = identity))

 anova(between_P, add_B, add_AB, add_C, add_AC, add_BC, full_model, test = "F")

5.4 Three-factor unreplicated split-plot model Y = C|Q'(B|P'(S'|A))

The commands below use data file 'Model5_4.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_4.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-2 factorial analysis

 model5_4i <- aov(Y ~ A*B*C + Error(S/A/B))

 summary(model5_4i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-P ANOVA for A averaged across B & C and tested against S

 YM <- as.vector(tapply(Y,list(A, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, S), mean)))

 SM <- factor(as.vector(tapply(as.numeric(S),list(A, S), mean)))

 anova(glm(YM ~ SM + AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Between-Q ANOVA for B and A:B averaged across C and tested against pooled interactions with S, both with a*(b-1)*(s-1) d.f.

 YM <- as.vector(tapply(Y,list(A, B, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, S), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, S), mean)))

 SM <- factor(as.vector(tapply(as.numeric(S),list(A, B, S), mean)))

 between_P <- glm(YM ~ AM*SM, family = gaussian(link = identity))

 add_B <- glm(YM ~ AM*SM + BM, family = gaussian(link = identity))

 add_AB <- glm(YM ~ AM*SM + BM/AM, family = gaussian(link = identity))

 anova(between_P, add_B, add_AB, test = "F")

 # Step 3. Within-Q ANOVA by model comparisons for sequentially added terms C, A:C, B:C, A:B:C, all tested against pooled interactions with S with a*b*(c-1)*(s-1) error d.f.

 between_Q <- glm(Y ~ A*B*S, family = gaussian(link = identity))

 add_C <- glm(Y ~ A*B*S + C, family = gaussian(link = identity))

 add_AC <- glm(Y ~ A*B*S + C/A, family = gaussian(link = identity))

 add_BC <- glm(Y ~ A*B*S + C/A + B:C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*B*S + A*B*C, family = gaussian(link = identity))

 anova(between_Q, add_C, add_AC, add_BC, full_model, test = "F")

5.5 Three-factor unreplicated split-plot model Y = C|P'(B|S'(A))

The commands below use data file 'Model5_5.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_5.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-2 factorial analysis

 model5_5i <- aov(Y ~ A*B*C + Error(S:A + S:A:B))

 summary(model5_5i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-P ANOVA for A averaged across B & C and tested against S

 YM <- as.vector(tapply(Y,list(A, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, S), mean)))

 SM <- factor(as.vector(tapply(as.numeric(S),list(A, S), mean)))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Between-P ANOVA for B and A:B averaged across C and tested against the interaction with S, both with a*(b-1)*(s-1) d.f.

 YM <- as.vector(tapply(Y,list(A, B, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, S), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, S), mean)))

 SM <- factor(as.vector(tapply(as.numeric(S),list(A, B, S), mean)))

 between_S <- glm(YM ~ AM/SM, family = gaussian(link = identity))

 add_B <- glm(YM ~ AM/SM + BM, family = gaussian(link = identity))

 add_AB <- glm(YM ~ AM/SM + BM/AM, family = gaussian(link = identity))

 anova(between_S, add_B, add_AB, test = "F")

 # Step 3. Within-P ANOVA by model comparisons for sequentially added terms C, A:C, B:C, A:B:C, all tested against pooled interactions with S with a*b*(c-1)*(s-1) error d.f.

 between_P <- glm(Y ~ A*B*S, family = gaussian(link = identity))

 add_C <- glm(Y ~ A*B/S + C, family = gaussian(link = identity))

 add_AC <- glm(Y ~ A*B/S + C/A, family = gaussian(link = identity))

 add_BC <- glm(Y ~ A*B/S + C/A + B:C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*B/S + A*B*C, family = gaussian(link = identity))

 anova(between_P, add_C, add_AC, add_BC, full_model, test = "F")

5.6/6.3 Two-factor unreplicated split-plot model Y = B|S'(A)

The fully replicated version of this design is analyzed by model 3.3.

The commands below use data file 'Model5_6.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_6.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model5_6i <- aov(Y ~ A*B + Error(S:A))

 summary(model5_6i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-S ANOVA for A averaged across B and tested against S

 YM <- as.vector(tapply(Y,list(A, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, S), mean)))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Within-S ANOVA by model comparisons for sequentially added terms B, A:B, both tested against A:B:S with a*(b-1)*(s-1) error d.f.

 between_S <- glm(Y ~ A/S, family = gaussian(link = identity))

 add_B <- glm(Y ~ A/S + B, family = gaussian(link = identity))

 add_AB <- glm(Y ~ A/S + B/A, family = gaussian(link = identity))

 anova(between_S, add_B, add_AB, test = "F")

5.7/6.5 Three-factor unreplicated split-plot model Y = C|B|S'(A)

See model 6.5 for a Model-1 analysis of this design. The commands below use data file 'Model5_7.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_7.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-2 factorial analysis

 model5_7i <- aov(Y ~ A*B*C + Error(S:A))

 summary(model5_7i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-S ANOVA for A averaged across B & C, and tested against S

 YM <- as.vector(tapply(Y,list(A, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, S), mean)))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Within-S ANOVA by model comparisons for sequentially added terms B, C, A:B, A:C, A:B:C, all tested against pooled interactions with S, with a*(b*C-1)*(s-1) error d.f.

 between_S <- glm(Y ~ A/S, family = gaussian(link = identity))

 add_B <- glm(Y ~ A/S + B, family = gaussian(link = identity))

 add_C <- glm(Y ~ A/S + B + C, family = gaussian(link = identity))

 add_AB <- glm(Y ~ A/S + B/A + C, family = gaussian(link = identity))

 add_AC <- glm(Y ~ A/S + B/A + C/A, family = gaussian(link = identity))

 add_BC <- glm(Y ~ A/S + A*B*C - A:B:C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A/S + A*B*C, family = gaussian(link = identity))

 anova(between_S, add_B, add_C, add_AB, add_AC, add_BC, full_model, test = "F")

5.7 model Y = C|B|S'(A') with contrasts on 3-level B

The commands below use data file 'Model5_7contrasts.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_7contrasts.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C) ; D <- factor(D) ; E <- factor(E)

Commands for Model-2 factorial analysis with random factor A'

 model5_7iv_c <- aov(Y ~ A*E*C + A*E*C:D + Error(A:S + A:E + A:C + A:C:D + A:E:C + A:E:C:D))

 summary(model5_7iv_c)           # then calculate F-values by hand

5.8/6.6 Split plot with nesting model Y = C|S'(B'(A))

The commands below use data file 'Model5_8.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_8.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 # To avoid calculating F-values by hand, use the suite of commands in the next section for analysis by glm

 model5_8i <- aov(Y ~ A/B/S*C + Error(A:B + S:A:B))

 summary(model5_8i)              # then calculate F-values by hand

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-B ANOVA for A averaged across S and C and tested against B

 YM <- as.vector(tapply(Y,list(A, B), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B), mean)))

 anova(glm(YM ~ AM, family = gaussian(link = identity)), test = "F")

 # Step 2. Between-subjects ANOVA for B averaged across C and tested against S

 YM <- as.vector(tapply(Y,list(A, B, S), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, S), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, S), mean)))

 between_B <- glm(YM ~ AM, family = gaussian(link = identity))

 between_S <- glm(YM ~ AM/BM, family = gaussian(link = identity))

 anova(between_B, between_S, test = "F")

 # Step 3. Within-subjects ANOVA by model comparisons for sequentially added terms C, A:C, averaged across S and both tested against A:B:C with (c-1)*(b-a)*a error d.f.

 YM <- as.vector(tapply(Y,list(A, B, C), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(A, B, C), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(A, B, C), mean)))

 CM <- factor(as.vector(tapply(as.numeric(C),list(A, B, C), mean)))

 between_S <- glm(YM ~ AM/BM, family = gaussian(link = identity))

 add_C <- glm(YM ~ AM/BM + CM, family = gaussian(link = identity))

 add_AC <- glm(YM ~ AM/BM + CM/AM, family = gaussian(link = identity))

 anova(between_S, add_C, add_AC, test = "F")

 # Step 4. Within-subjects ANOVA by model comparison for sequentially added term A:B:C

 between_S <- glm(Y ~ A/B/S + C/A, family = gaussian(link = identity))

 full_model <- glm(Y ~ A/B/S + C/A + A:B:C, family = gaussian(link = identity))

 anova(between_S, full_model, test = "F")

5.9/6.7 Split plot with nesting model Y = C|S'(B|A)

The commands below use data file 'Model5_9.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_9.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 model5_9i <- aov(Y ~ A*B*C + Error(S:A:B))

 summary(model5_9i)

Or by glm with the error distribution belonging to a named family

 # Step 1. Between-subjects ANOVA for A*B averaged across C, all tested against A:B:S with error d.f. = (s-1)*a*b

 YM <- as.vector(tapply(Y,list(S, A, B), mean))

 AM <- factor(as.vector(tapply(as.numeric(A),list(S, A, B), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(S, A, B), mean)))

 SM <- factor(as.vector(tapply(as.numeric(S),list(S, A, B), mean)))

 anova(glm(YM ~ AM*BM/SM - AM:BM:SM, family = gaussian(link = identity)), test = "F")

 # Step 2. Within-subjects ANOVA by model comparisons for sequentially added terms C, A:C, B:C, A:B:C all tested against A:B:C:S with error d.f. = a*b*(c-1)*(s-1)

 between_S <- glm(Y ~ A*B + A:B/S, family = gaussian(link = identity))

 add_C <- glm(Y ~ A*B + A:B/S + C, family = gaussian(link = identity))

 add_AC <- glm(Y ~ A*B + A:B/S + C/A, family = gaussian(link = identity))

 add_BC <- glm(Y ~ A*B + A:B/S + C/A + B:C, family = gaussian(link = identity))

 full_model <- glm(Y ~ A*B*C + A:B/S, family = gaussian(link = identity))

 anova(between_S, add_C, add_AC, add_BC, full_model, test = "F")

5.9/6.7 model Y = C|S'(B|A) on a response of proportions

The commands below use data file 'Model5_9binomial.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model5_9binomial.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis by aov or glm with a normal error distribution of the transformed response

Apply an arcsine-root transformation to the response, and then use commands as in the previous subsection for aov or glm.

 Y <- asin(sqrt(Su/(Su + Fa)))

Or by glm with the error distribution of frequencies belonging to the binomial family

 # Step 1. Between-subjects ANOVA for A*B averaged across C, all tested against A:B:S with error d.f. = (s-1)*a*b

 SuM <- as.integer(tapply(Su,list(S, A, B), mean))

 FaM <- as.integer(tapply(Fa,list(S, A, B), mean))

 YM <- cbind(SuM,FaM)

 AM <- factor(as.vector(tapply(as.numeric(A),list(S, A, B), mean)))

 BM <- factor(as.vector(tapply(as.numeric(B),list(S, A, B), mean)))

 SM  <- factor(as.vector(tapply(as.numeric(S),list(S, A, B), mean)))

 anova(glm(YM ~ AM*BM/SM - AM:BM:SM , family = binomial), test = "Chisq")

 # Step 2. Within-subjects ANOVA by model comparisons for sequentially added terms C, A:C, B:C, A:B:C all tested against A:B:C:S with error d.f. = a*b*(c-1)*(s-1)

 Y <- cbind(Su,Fa)

 between_S <- glm(Y ~ A*B + A:B/S, family = binomial)

 add_C <- glm(Y ~ A*B + A:B/S + C, family = binomial)

 add_AC <- glm(Y ~ A*B + A:B/S + C/A, family = binomial)

 add_BC <- glm(Y ~ A*B + A:B/S + C/A + B:C, family = binomial)

 full_model <- glm(Y ~ A*B*C + A:B/S, family = binomial)

 anova(between_S, add_C, add_AC, add_BC, full_model, test = "Chisq")

 # If the residual deviance >> residual d.f., then significances from chi-square tests will be inflated by "overdispersion". One way to compensate for overdispersion is to treat the deviance for each term as a Sum of Squares, and calculate the Mean Squares and F by hand (e.g., M. J. Crawley 2002 Statistical Computing, Wiley, pp 526-532). This needs doing manually for each term, obtaining its Error Mean Square by dividing the residual deviance of the last-entered term in a step by its residual d.f. (requesting: test = "F" will produce an F-value calculated by an alternative method that does not compensate for overdispersion, and may differ little in significance from the chi-square).

6 Repeated-measures designs

6.1 One-factor repeated-measures model Y = S'|A

Analysis of this model is exactly as for model 4.1.

The fully replicated version of this design is analyzed by the random-factor version of model 3.1.

6.2 Two-factor repeated-measures model Y = S'|B|A

Analysis of this model is exactly as for model 4.2.

6.3 Two-factor model with repeated-measures on one factor Y = B|S'(A)

Analysis of this model is exactly as for model 5.6.

The fully replicated version of this design is analyzed by model 3.3.

6.4 Repeated measures on nested cross factors model Y = C'(B)|S'(A)

The commands below use data file 'Model6_4.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model6_4.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for factorial analysis

 model6_4i <- aov(aov(Y ~ A/S + B/C + A:B + A/S:B + A:B/C + A/S:B/C))

 summary(model6_4i)              # then calculate F-values by hand

6.5 Three-factor unreplicated split-plot model Y = C|B|S'(A)

See model 5.7 for a Model-2 analysis of this design.

The commands below use data file 'Model6_5.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Model6_5.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B) ; C <- factor(C)

Commands for Model-1 factorial analysis

 model6_5i <- aov(Y ~ A*B*C + Error(S:A + B:S:A + C:S:A))

 summary(model6_5i)

6.6 Nested model with repeated measures on a cross factor Y = C|S'(B'(A))

Analysis of this model is exactly as for model 5.8.

6.7 Three-factor model with repeated measures on one factor Y = C|S'(B|A)

Analysis of this model is exactly as for model 5.9.

Analysis of datasets for figures in Doncaster & Davey (2007)

Figure 1 model 1.1 Y = A + ε

The commands below use data file 'Fig1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig1.txt", header = T)

 attach(aovdata)

 A <- factor(A)

Commands for factorial analysis

 model1_1i <- aov(Y ~ A)

 summary(model1_1i)

Figure 2 regression model 1.1 Y = X + ε

The commands below use data file 'Fig2.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig2.txt", header = T)

 attach(aovdata)

Commands for regression analysis

 model1_1reg <- aov(Y ~ X)

 summary(model1_1reg)

Or equally for estimates of regression coefficients

 model1_1reg <- lm(Y ~ X)

 summary(model1_1reg)

Figure 3 model 1.1 Y = X + ε

The commands below use data file 'Fig3.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig3.txt", header = T)

 attach(aovdata)

Commands for factorial analysis

 model1_1reg <- aov(Y ~ X)

 summary(model1_1reg)

Figure 4 model 2.1 Y = B'(A) + ε

The commands below use data file 'Fig4.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig4.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model2_1i <- aov(Y ~ A + Error(A:B))          # test A

 summary(model2_1i)

 anova(aov(Y ~ A), aov(Y ~ A/B), test = "F")   # test A:B

Figure 5 model 3.1 Y = B|A + ε

The commands below use data file 'Fig5.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig5.txt", header = T)

 attach(aovdata)

Commands for factorial analysis

 anova(glm(Y ~ A*B), test = "F")

Figure 6(a) model 3.1 Y = B|A + ε

The commands below use data file 'Fig6a.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig6a.txt", header = T)

 attach(aovdata)

Commands for factorial analysis

 model3_1 <- aov(Y ~ A*B)

 summary(model3_1)

Figure 6(b) model 4.2 Y = S|B|A

The commands below use data file 'Fig6b.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig6b.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B)

Commands for Model-1 factorial analysis

 model4_2i <- aov(Y ~ S + A*B + Error(S + S:A + S:B))

 summary(model4_2i)

Commands for Model-2 factorial analysis

 model4_2i <- aov(Y ~ S + A*B)

 summary(model4_2i)

Figure 7(a) model 5.1 Y = B|P'(S'|A)

The commands below use data file 'Fig7a.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig7a.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model5_1i <- aov(Y ~ A*B + Error(S/A))

 summary(model5_1i)

Figure 7(b) model 5.6 Y = B|S(A)

The commands below use data file 'Fig7b.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig7b.txt", header = T)

 attach(aovdata)

 S <- factor(S) ; A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model5_6i <- aov(Y ~ A*B + Error(S:A))

 summary(model5_6i)

Figure 8 covariate model 3.1 Y = X|A + ε

The commands below use data file 'Fig8.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig8.txt", header = T)

 attach(aovdata)

 A <- as.factor(A) ; X <- as.numeric(X)

Commands for covariate analysis

 model3_1 <- aov(Y ~ X*A)

 summary(model3_1)

Figure 9 regression model 1.1 Y = X + ε

The commands below use data file 'Fig9.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig9.txt", header = T)

 attach(aovdata)

Commands for regression analysis

 model1_1reg <- aov(Y ~ X)

 summary(model1_1reg)

Figure 10 model 3.1 Y = B|A + ε

The commands below use data file 'Fig10.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig10.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model3_1 <- aov(Y1 ~ A*B)

 summary(model3_1)

Figure 11(a) two-covariate model 3.1 r = t + tplus1 + ε

The commands below use data file 'Fig11a.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig11a.txt", header = T)

 attach(aovdata)

 t <- as.numeric(t) ; tplus1 <- as.numeric(tplus1)

Commands for covariate analysis

 anova(aov(r ~ tplus1), aov(r ~ t + tplus1))   # t, adjusted for tplus1

 anova(aov(r ~ t), aov(r ~ t + tplus1))        # tplus1, adjusted for t

Figure 11(b) two-covariate model 3.1 r = t + tplus1 + ε

The commands below use data file 'Fig11b.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Fig11b.txt", header = T)

 attach(aovdata)

 t <- as.numeric(t) ; tplus1 <- as.numeric(tplus1)

Commands for factorial analysis

 anova(aov(r ~ tplus1), aov(r ~ t + tplus1))   # t, adjusted for tplus1

 anova(aov(r ~ t), aov(r ~ t + tplus1))        # tplus1, adjusted for t

Worked example 1: model 2.1 Y = B'(A) + ε

The commands below use data file 'Worked1.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Worked1.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model2_1i <- aov(Y ~ A + Error(A:B))          # test A

 summary(model2_1i)

 anova(aov(Y ~ A), aov(Y ~ A/B), test = "F")   # test A:B

Worked example 2: model 3.1 Y = B|A + ε

The commands below use data file 'Worked2.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Worked2.txt", header = T)

 attach(aovdata)

 A <- factor(A) ; B <- factor(B)

Commands for factorial analysis

 model3_1 <- aov(Y ~ A*B)

 summary(model3_1)

Worked example 3: model 3.3 Y = C|B'(A) + ε

The commands below use data file 'Worked3.txt' on the web for an example analysis.

Prepare the data frame

 aovdata <- read.table("http://www.southampton.ac.uk/~cpd/anovas/datasets/R/Worked3.txt", header = T)

 attach(aovdata)

 Shore <- factor(Shore) ; Recruit <- factor(Recruit) ; Trtmnt <- factor(Trtmnt)

Commands for factorial analysis

 model3_3i <- aov(Density ~ Recruit*Trtmnt + Error(Recruit:Shore + Recruit:Shore:Trtmnt))

 summary(model3_3i)

 anova(aov(Density ~ Recruit/Shore + Recruit*Trtmnt), aov(Density ~ Recruit/Shore*Trtmnt), test = "F") # test Trtmnt:Recruit/Shore (and calculation of F-value for Recruit:Shore by hand from residual error)

Commands for analysis with covariate Trtmnt

 Shore <- as.factor(Shore) ; Recruit <- as.factor(Recruit) ; Trtmnt <- as.numeric(Trtmnt)

 model3_3i <- aov(Density ~ Recruit*Trtmnt + Error(Recruit:Shore + Recruit:Shore:Trtmnt))

 summary(model3_3i)

 anova(aov(Density ~ Recruit/Shore + Recruit*Trtmnt), aov(Density ~ Recruit/Shore*Trtmnt), test = "F") # test Trtmnt:Recruit/Shore (and calculation of F-value for Recruit:Shore by hand from residual error)

Significance of F

Example calculation of significance p of an observed F_df1_,df2 = q

 1 - pf(q = 5.98, df1 = 1, df2 = 6)

 [1] 0.05010252

 2*(1 - pt(q = sqrt(5.98), df = 6)) # two-tailed p at t = sqrt(F) equals p at F

 [1] 0.05010252

Example calculation of value of F_df1_,df2 at critical α = 1 - p

 qf(p = 1-0.05, df1 = 1, df2 = 6)

 [1] 5.987378

 qt(p = 1-0.05/2, df = 6)^2 # t^2 at two-tailed alpha equals F at alpha

 [1] 5.987378

Power calculations for ANOVA designs

Power calculations for a term in any ANOVA design require values of all but any one of these parameters:

- Critical probability of Type-I error, α

- Test degrees of freedom, df1

- Error degrees of freedom, df2

- Effective sample size, n

- Effect size variance in the sampled population, θ²

- Error variance in the sampled population, σ²

- Power of the design to detect an effect at α, given θ²/σ²

If the term is fixed, its non-centrality parameter, ncp = df1*n*θ²/σ²

If the term is random, its expected value of F, evf = 1 + n*θ²/σ²

Calculation of power for terms in any design for ANOVA

Example of power calculation for a fixed treatment term given α, df1, df2, n, θ²/σ²

 # Declare quantities

 alpha <- 0.05 ; df1 <- 2 ; df2 <- 9 ; n <- 24 ; theta2 <- 0.25 ; sigma2 <- 1

 # Calculate power of fixed term

 ncp <- df1*n*theta2/sigma2

 power_fixed <- 1 - pf(q = qf(p = 1-alpha, df1, df2), df1, df2, ncp)

 power_fixed

 [1] 0.743147

Example of power calculation for a random term given α, df1, df2, n, θ²/σ²

 # Declare quantities

 alpha <- 0.05 ; df1 <- 2 ; df2 <- 9 ; n <- 24 ; theta2 <- 0.25 ; sigma2 <- 1

 # Calculate power of random term

 evf <- 1 + n*theta2/sigma2

 power_random <- 1 - pf(q = qf(p = 1-alpha, df1, df2)/evf, df1, df2)

 power_random

 [1] 0.5653277

Power arguments for a fixed treatment for one-way ANOVA

Example calculation of n for one-way ANOVA given σ², θ², power, significance level α

 power.anova.test(groups = 2, n = NULL, within.var = 1, between.var = 1.6, power = 0.9, sig.level = 0.05)

 # yields n = 7.668471

Example calculation of θ² for one-way ANOVA given n, σ², power, significance level α

 power.anova.test(groups = 2, n = 6, within.var = 1, between.var = NULL, power = 0.9, sig.level = 0.05)

 # yields θ^2 = 2.164521

Power arguments for a one- or two-sample t test

Example calculation of n for a two-sample t-test of difference given σ, true difference delta, power, significance level α

 power.t.test(n = NULL, sd = 45, delta = 26, power = 0.9, sig.level = 0.05, type = "two.sample", alternative = "two.sided")

 # yields n = 63.92708

Example calculation of true size delta for a one-sample t-test of directional difference given n, σ, power, significance level α

 power.t.test(n = 10, sd = 45, delta = NULL, power = 0.9, sig.level = 0.05, type = "one.sample", alternative = "one.sided")

 # yields delta = 45.21825

Code formatted by Pretty R at inside-R.org ... RIP. Pages maintained by C. Patrick Doncaster, 2 July 2024

Analysis of Variance and Covariance in R

C. Patrick Doncaster

Contents

Useful commands in R

Help

Set the working directory

Create vectors

Read in data with column headers from a '.txt' text file in the working directory

Read in data from a '.csv' Excel file in the working directory

List objects

Set the class of a vector

Transform vectors

Summary statistics on vectors

Diagnostic checks on the distribution of the residuals

Plot the response to one or more factors

Panel plots of responses to factor A at each level of factor B

Plot factor means with error bars

Plot the response to a numeric variable

Plot the response to numeric variable A at each level of factor B

Panel plots of responses to numeric variable A at each level of factor B

Open a library package

Detach a data frame after completion of analysis

Remove objects from the working environment

Clear the console

R commands for analysis of ANOVA and ANCOVA datasets

1 One-factor designs

1.1 One-factor model Y = A + ε

Prepare the data frame

Commands for factorial analysis

Or by glm with the error distribution belonging to a named family

Commands for plotting Y at each level of factor A

Commands for covariate analysis

Or equally for estimates of the linear regression coefficients

Or by glm with the error distribution belonging to a named family

Commands for plotting the linear regression of Y against numeric A

1.1 model Y = A + ε with contrasts on 3-level A

Prepare the data frame

Commands for factorial analysis

Or equally

Or equally

Or by glm with the error distribution belonging to a named family

Or equally

1.1 model Y = A + ε with contrasts on 5-level A

Prepare the data frame

Commands for factorial analysis

Or equally

Or by glm with the error distribution belonging to a named family

Or equally

2 Nested designs

2.1 Two-factor nested model Y = B'(A) + ε

Prepare the data frame

Commands for factorial analysis

Or by glm with the error distribution belonging to a named family

Commands for analysis with covariate A

Or by glm for analysis with covariate A

2.2 Three-factor nested model Y = C'(B'(A)) + ε

Prepare the data frame

Commands for factorial analysis

Or by glm with the error distribution belonging to a named family

3 Fully replicated factorial designs

3.1 Two-factor fully cross-factored model Y = B|A + ε

Prepare the data frame

Commands for factorial analysis

Or by glm with the error distribution belonging to a named family

Command for interaction plot of cross-factored means

Commands for analysis with covariate A

Commands for plotting the response to numeric A at each level of factor B

Commands for panel plots of numeric A at each level of factor B

Commands for analysis with covariates A and B

3.1 Random-factor version of model Y = B'|A + ε

Prepare the data frame

Commands for factorial analysis

Or by glm with the error distribution belonging to a named family

3.1 model Y = B|A + ε with contrasts on 3-level A

Prepare the data frame

Commands for factorial analysis

Or equally

Or by glm with the error distribution belonging to a named family

Or equally

3.1 model Y = B|A + ε with contrasts on 3-level A and on 4-level B