Note that this page is in the process of being updated 2022-10-04

For these examples, we are going to work with a fictional data that that has two fully nested (i.e., between-subjects) and two fully crossed (i.e., within-subject) factors. We have two hypothetical groups of older adults (OA, >65 y) and younger adults (YA, <35 y). Within each of those groups, half of the participants were in the control group (C) and half were in the treatment group (T). Each participant was also measured on four different trials (1 to 4) in three different conditions (A, B, and C).

First, we will open the relevant R packages that we will use and then we will download the data from the web. (If you don’t have these packages installed, you will need install the packages before running the library() function.) Once the data are downloaded we can use the head() function to view the first ten rows of the data.

library(tidyverse); library(RCurl); library(ez); library(lme4); library(car); library(lmerTest)

DATA <- read.csv("https://raw.githubusercontent.com/keithlohse/mixed_effects_models/master/data_AGING_example.csv",
                 stringsAsFactors = TRUE)

head(DATA, 10)

For the examples we will walk through, we will ignore different variables at different times. For instance, we will want to average across trials to create a data set with one observation per condition (which we will call data_COND), and we will want to average across conditions to create a data set with one observation for each trial (which we will call data_TIME).

Averaging across trials, here are the first ten rows of data_COND.

data_COND <- DATA %>% group_by(subID, condition, age_group, group) %>% # Note we ignore trial to average across it
  summarize(speed = mean(speed, na.rm=TRUE)) %>% # I have included na.rm=TRUE even though there are no missing data
  arrange(age_group, subID, condition) # finally, we can resort our data by subIDs within groups

## `summarise()` has grouped output by 'subID', 'condition', 'age_group'. You can
## override using the `.groups` argument.

head(data_COND, 10)

Averaging across conditions, here are the first ten rows of data_TIME.

data_TIME <- DATA %>% group_by(subID, time, age_group, group) %>% # Note we ignore condition to average across it
  summarize(speed = mean(speed, na.rm=TRUE)) %>% # I have included na.rm=TRUE even though there are no missing data
  arrange(age_group, subID, time) # finally, we can resort our data by subIDs within groups

## `summarise()` has grouped output by 'subID', 'time', 'age_group'. You can override
## using the `.groups` argument.

head(data_TIME, 10)

With these data in place, we can now test all of our different models. We are going to explore the appropriate mixed-effects regression (MER) models for these different situations:

MER as One-Way Repeated Measures ANOVA
- When we have a single crossed factor (i.e., one within-subjects factor) in our mixed-effects model.
MER as Two-Way Repeated Measures ANOVA
- When we have multiple crossed factors (i.e., two within-subjects factors) in our mixed-effects model.
MER as Mixed-Factorial ANOVA with a Single Within-Subjects Factor
- When we have a single crossed factor (i.e., one within-subjects factor) and a single nested factor (i.e., between-subjects factors) in our mixed-effects model.
MER as Mixed-Factorial ANOVA with Multiple Within-Subjects Factors
- When we have multiple crossed factors (i.e., two within-subjects factor) and a single nested factor (i.e., between-subjects factors) in our mixed-effects model.

1. MER as One-Way Repeated Measures ANOVA

(A Single Crossed Factor)

For this example, we will focus on only the effect of condition, so we will use the data_COND dataset to average across different trials. First, let’s plot the data to get a better sense of what the data look like.

ggplot(data_COND, aes(x = condition, y = speed)) +
  geom_point(aes(fill=condition), pch=21, size=2,
             position=position_jitter(w=0.2, h=0))+
  geom_boxplot(aes(fill=condition), col="black", 
               alpha=0.4, width=0.5, outlier.shape = NA)+
  scale_x_discrete(name = "Condition") +
  scale_y_continuous(name = "Speed (m/s)") +
  theme(axis.text=element_text(size=16, color="black"), 
        axis.title=element_text(size=16, face="bold"),
        plot.title=element_text(size=16, face="bold", hjust=0.5),
        panel.grid.minor = element_blank(),
        strip.text = element_text(size=16, face="bold"),
        legend.position = "none")

1.1. As an ANOVA…

To implement a simple one-way repeated measures ANOVA, we have a few options. We could directly code our ANOVA using the aov() function in R:

summary(aov(speed ~ condition + Error(subID/condition), data=data_COND))

## 
## Error: subID
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 39  2.751 0.07054               
## 
## Error: subID:condition
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## condition  2 0.2820 0.14098   24.46 5.67e-09 ***
## Residuals 78 0.4496 0.00576                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Or we can use the ezANOVA() function from the “ez” package.

ezANOVA(data = data_COND, 
    dv = .(speed),
    wid = .(subID),
    within = .(condition)
)

## $ANOVA
##      Effect DFn DFd        F            p p<.05        ges
## 2 condition   2  78 24.46039 5.674295e-09     * 0.08095761
## 
## $`Mauchly's Test for Sphericity`
##      Effect         W         p p<.05
## 2 condition 0.8955986 0.1230706      
## 
## $`Sphericity Corrections`
##      Effect       GGe        p[GG] p[GG]<.05      HFe        p[HF] p[HF]<.05
## 2 condition 0.9054678 2.494269e-08         * 0.947018 1.300658e-08         *

Although these two different functions present the results in slightly different ways, note that the f-values for the two different omnibus tests match. By either method, the f-observed for the main-effect of condition is F(2,78)=24.46, p<0.001.

If you are familiar with issues of contrast versus treatment coding, Type I versus Type III Sums of Squared Errors, and collinearity, then directly controlling your own ANOVA using base R is probably a safe bet.

If you are not familiar with those terms/issues, then the ezANOVA code is probably the best option for you. However, I would strongly encourage you to develop a more detailed understanding of general-linear models before jumping into mixed-effect models.

By default the aov() function provides a test of your statistical model using Type I Sums of Squared Errors, whereas the ezANOVA() function provides a test of your statistical model using Type III Sums of Squared Errors. In a completely orthogonal design where all factors are statistically independent of each other (i.e., there is no collinearity), the Type I and Type III Sums of Squared Errors will agree. By default, R also uses treatment coding (or “dummy” coding) for categorical factors rather than orthogonal contrast codes. The omnibus F-tests that we see in the ANOVA output will be the same regardless of the types of codes used, but if we dig into individual regression coefficients, it is important to remember how these variables were coded so that we can interpret them correctly.

1.2. Getting the same result with a mixed-effect model…

Because we have a single within-subject factor, we will need to add a random-effect of subject to account for individual differences between subjects. By partitioning the between-subjects variance out of our model, we can fairly test the effect of condition, because our residuals will now be independent of each other.

# First we will define our model
mod1 <- lmer(speed ~ 
               # Fixed Effects:
               condition + 
               # Random Effects: 
               (1|subID), 
             # Define the data: 
             data=data_COND, REML = TRUE)

# We can then get the ANOVA results for our model:
anova(mod1)

Most critically, note that the F-value, F(2,78)=24.46, p<0.001 is identical in both the RM ANOVA and in the mixed-effects regression model.

If we want to delve deeper into our model, we can also use the summary() function to get more information about model fit statistics, parameter estimates, random-effects and residuals.

summary(mod1)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: speed ~ condition + (1 | subID)
##    Data: data_COND
## 
## REML criterion at convergence: -162.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.5766 -0.3736 -0.0165  0.3682  3.8369 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subID    (Intercept) 0.021594 0.14695 
##  Residual             0.005764 0.07592 
## Number of obs: 120, groups:  subID, 40
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)  0.84250    0.02615 52.09109  32.215   <2e-16 ***
## conditionB   0.11831    0.01698 78.00000   6.970    9e-10 ***
## conditionC   0.05050    0.01698 78.00000   2.975   0.0039 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr) cndtnB
## conditionB -0.325       
## conditionC -0.325  0.500

2. MER as Two-Way Repeated Measures ANOVA

(Multiple Crossed Factors)

For this example, we will be analyzing both Time and Condition, so we will use our full data frame, DATA. First, let’s plot the data to get a better sense of what our data look like.

DATA$time <- factor(DATA$time)

ggplot(DATA, aes(x = time, y = speed)) +
  geom_point(aes(fill=condition), pch=21, size=2,
             position=position_jitterdodge(dodge.width = 0.5, jitter.width = 0.1))+
  geom_boxplot(aes(fill=condition), col="black", 
               alpha=0.4, width=0.5, outlier.shape = NA)+
  labs(fill="Condition")+
  scale_x_discrete(name = "Trial") +
  scale_y_continuous(name = "Speed (m/s)", limits = c(0,3)) +
  theme(axis.text=element_text(size=16, color="black"), 
        axis.title=element_text(size=16, face="bold"),
        legend.text=element_text(size=16, color="black"), 
        legend.title=element_text(size=16, face="bold"),
        plot.title=element_text(size=16, face="bold", hjust=0.5),
        panel.grid.minor = element_blank(),
        strip.text = element_text(size=16, face="bold"),
        legend.position = "bottom")

2.1. As an ANOVA…

Before we run our models, we want to convert trial to a factor so that our model is treating time categorically rather than continuously. (In later modules, we will discuss how to mix continuous and categorical factors.) Additionally, remember that we are now using our larger data set DATA rather than our aggregated data set data_COND. To implement a two-way repeated measures ANOVA, we have the same options as before. We can directly code our ANOVA using the aov() function in R:

DATA$time <- factor(DATA$time)
summary(aov(speed ~ condition*time + Error(subID/(condition*time)), data=DATA))

## 
## Error: subID
##           Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 39  11.01  0.2822               
## 
## Error: subID:condition
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## condition  2  1.128  0.5639   24.46 5.67e-09 ***
## Residuals 78  1.798  0.0231                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: subID:time
##            Df Sum Sq Mean Sq F value Pr(>F)    
## time        3 10.618   3.539     106 <2e-16 ***
## Residuals 117  3.909   0.033                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: subID:condition:time
##                 Df Sum Sq Mean Sq F value   Pr(>F)    
## condition:time   6 0.7851 0.13084   13.53 3.59e-13 ***
## Residuals      234 2.2626 0.00967                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Or we can use the ezANOVA() function from the “ez” package.

ezANOVA(data = DATA, 
    dv = .(speed),
    wid = .(subID),
    within = .(time, condition)
)

## $ANOVA
##           Effect DFn DFd         F            p p<.05        ges
## 2           time   3 117 105.95095 3.295544e-33     * 0.35881815
## 3      condition   2  78  24.46039 5.674295e-09     * 0.05610418
## 4 time:condition   6 234  13.53149 3.588475e-13     * 0.03973035
## 
## $`Mauchly's Test for Sphericity`
##           Effect         W            p p<.05
## 2           time 0.3262678 5.356910e-08     *
## 3      condition 0.8955986 1.230706e-01      
## 4 time:condition 0.2380896 9.170350e-05     *
## 
## $`Sphericity Corrections`
##           Effect       GGe        p[GG] p[GG]<.05       HFe        p[HF]
## 2           time 0.6917461 9.817867e-24         * 0.7313027 5.956088e-25
## 3      condition 0.9054678 2.494269e-08         * 0.9470180 1.300658e-08
## 4 time:condition 0.6852099 1.068940e-09         * 0.7760402 1.056951e-10
##   p[HF]<.05
## 2         *
## 3         *
## 4         *

Again, regardless of the coding approach that you use, these two different functions produce the same f-observed for the main-effects of time, F(3,117)=105.95, p<0.001, and Condition, F(2,78)=24.46, p<0.001, and the Time x Condition interaction, F(6,234)=13.53, p<0.001.

2.2. Two-way RM ANOVA as a mixed-effect model…

Because we now have two crossed factors, we need to account not only for the fact that we have multiple observations for each participant, but we have multiple observations for each person at each level of each factor.

For instance, for the effect of Condition, we actually have Condition effects at Trial 1, Trial 2, Trial 3, and Trial 4. The converse is also true for the effect of Time, we have four different trials in Condition A, Condition B, and Condition C. Thus, in order to appropriately account for the statistical dependencies in our data, we need to add random-effects of “time:subID” and “condition:subID” to the model.

The colon operator (“:”) means that we are crossing or multiplying these factors. That is, if we have subject ID’s A, B, and C and Trials 1, 2, 3, and 4, then we end up with A1, A2, A3, A4, B1, B2, B3, B4, etc.

For more information on how random-effects are specified and what they mean in R, I recommend looking at this discussion on Stack Exchange: https://stats.stackexchange.com/questions/228800/crossed-vs-nested-random-effects-how-do-they-differ-and-how-are-they-specified

# First we will define our model
mod2 <- lmer(speed ~ 
               # Fixed Effects:
               time*condition + 
               # Random Effects
               (1|subID)+ (1|time:subID) + (1|condition:subID), 
               # Define your data, 
             data=DATA, REML=TRUE)

# We can then get the ANOVA results for our model:
anova(mod2)

Again, for our purposes, the critical thing to note is that the F-values for the main-effects and interactions are the same between our RM ANOVAs and the mixed-effects model. Without delving into the mathematical details, this is a good demonstration that the appropriate random-effects our regression model make it analogous to the factorial ANOVA. This allows us to capitalize on the benefits of mixed-effects regression for designs that we would normally analyze using factorial ANOVA.

If we want to delve deeper into our model, we can also use the summary() function to get more information about model fit statistics, parameter estimates, random-effects and residuals.

summary(mod2)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: speed ~ time * condition + (1 | subID) + (1 | time:subID) + (1 |  
##     condition:subID)
##    Data: DATA
## 
## REML criterion at convergence: -454.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9509 -0.4451 -0.0673  0.3844  3.2827 
## 
## Random effects:
##  Groups          Name        Variance Std.Dev.
##  time:subID      (Intercept) 0.007912 0.08895 
##  condition:subID (Intercept) 0.003346 0.05785 
##  subID           (Intercept) 0.019616 0.14006 
##  Residual                    0.009669 0.09833 
## Number of obs: 480, groups:  time:subID, 160; condition:subID, 120; subID, 40
## 
## Fixed effects:
##                   Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)        1.00975    0.03184 109.12304  31.716  < 2e-16 ***
## time2             -0.15300    0.02965 249.81033  -5.160 5.03e-07 ***
## time3             -0.25350    0.02965 249.81033  -8.550 1.25e-15 ***
## time4             -0.26250    0.02965 249.81033  -8.853  < 2e-16 ***
## conditionB         0.24625    0.02551 260.37317   9.653  < 2e-16 ***
## conditionC         0.10125    0.02551 260.37317   3.969 9.34e-05 ***
## time2:conditionB  -0.07025    0.03110 234.00008  -2.259  0.02479 *  
## time3:conditionB  -0.19800    0.03110 234.00008  -6.367 1.01e-09 ***
## time4:conditionB  -0.24350    0.03110 234.00008  -7.831 1.68e-13 ***
## time2:conditionC  -0.03825    0.03110 234.00008  -1.230  0.21990    
## time3:conditionC  -0.06425    0.03110 234.00008  -2.066  0.03991 *  
## time4:conditionC  -0.10050    0.03110 234.00008  -3.232  0.00141 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) time2  time3  time4  cndtnB cndtnC tm2:cB tm3:cB tm4:cB
## time2       -0.466                                                        
## time3       -0.466  0.500                                                 
## time4       -0.466  0.500  0.500                                          
## conditionB  -0.401  0.320  0.320  0.320                                   
## conditionC  -0.401  0.320  0.320  0.320  0.500                            
## tim2:cndtnB  0.244 -0.524 -0.262 -0.262 -0.609 -0.305                     
## tim3:cndtnB  0.244 -0.262 -0.524 -0.262 -0.609 -0.305  0.500              
## tim4:cndtnB  0.244 -0.262 -0.262 -0.524 -0.609 -0.305  0.500  0.500       
## tim2:cndtnC  0.244 -0.524 -0.262 -0.262 -0.305 -0.609  0.500  0.250  0.250
## tim3:cndtnC  0.244 -0.262 -0.524 -0.262 -0.305 -0.609  0.250  0.500  0.250
## tim4:cndtnC  0.244 -0.262 -0.262 -0.524 -0.305 -0.609  0.250  0.250  0.500
##             tm2:cC tm3:cC
## time2                    
## time3                    
## time4                    
## conditionB               
## conditionC               
## tim2:cndtnB              
## tim3:cndtnB              
## tim4:cndtnB              
## tim2:cndtnC              
## tim3:cndtnC  0.500       
## tim4:cndtnC  0.500  0.500

3. MER as Mixed-Factorial ANOVA with One Crossed Factor

(2 (Age) x 3 (Condition) ANOVA Example

For this example, we will now consider both the between-subject factor of age, as well as the within-subject factor of condition. Do to this, we will average over the different trials (1, 2, 3, and 4) to get one observation in each condition. This design is sometimes referred as a “mixed-factorial” design, because we have a mix of between-subjects and within-subject factors. Depending on your background, you might be more familiar with this as a “split plot” design. Split plot comes from agronomy (where a lot of statistics where developed!) and refers to the fact that some plots are assigned to different levels of Factor A (e.g., Treatment versus Control), but within each plot there is a second level of randomization to different levels of Factor B (e.g., Conditions A, B, or C). Both terms describe the same thing but differ in their unit of analysis. In psychology, a single person is usually the experimental unit (hence “within-subject” variables) whereas in agronomy or biology a physical area might be the unit of analysis (hence “split-plot” variables). The more general way to talk about these terms is as nested- versus crossed-factors.

For fully nested factors, an experimental unit is represented in only one of the factors (e.g., a person can only be in the treatment group or the control group).
For fully crossed factors, an experimental unit is represented at all levels of the factor (e.g., e.g., a person is tested in conditions A, B, and C).

In our example, participants are nested within age-group, because each person is represented at only one level of that factor (i.e., you are only a younger adult or older adult). However, Condition is a crossed factor, because each person is represented at all levels of Condition (i.e., each person was measured in all three conditions). We can see this more clearly if we plot all of our data.

ggplot(data_COND, aes(x = condition, y = speed)) +
  geom_point(aes(fill=age_group), pch=21, size=2,
             position=position_jitterdodge(dodge.width = 0.5, jitter.width = 0.1))+
  geom_boxplot(aes(fill=age_group), col="black",
               alpha=0.4, width=0.5, outlier.shape=NA)+
  labs(fill="Age Group")+
  scale_x_discrete(name = "Condition") +
  scale_y_continuous(name = "Speed (m/s)") +
  theme(axis.text=element_text(size=16, color="black"), 
        axis.title=element_text(size=16, face="bold"),
        legend.text=element_text(size=16, color="black"), 
        legend.title=element_text(size=16, face="bold"),
        plot.title=element_text(size=16, face="bold", hjust=0.5),
        panel.grid.minor = element_blank(),
        strip.text = element_text(size=16, face="bold"),
        legend.position = "bottom")

3.1. As an ANOVA…

For this mixed factorial ANOVA, we have one factor of Condition that varies within-subjects, but we also have a factor of Age Group that varies between subjects. As before, we can directly code this into our analysis of variance using the aov() function or using the ezANOVA() function from the “ez” package.

summary(aov(speed ~ age_group*condition + Error(subID/condition), data=data_COND))

## 
## Error: subID
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## age_group  1  1.524  1.5238   47.18 3.71e-08 ***
## Residuals 38  1.227  0.0323                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: subID:condition
##                     Df Sum Sq Mean Sq F value   Pr(>F)    
## condition            2 0.2820 0.14098  27.271 1.18e-09 ***
## age_group:condition  2 0.0567 0.02834   5.482  0.00597 ** 
## Residuals           76 0.3929 0.00517                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Or we can use the ezANOVA() function from the “ez” package.

ezANOVA(data = data_COND, 
    dv = .(speed),
    wid = .(subID),
    within = .(condition),
    between = .(age_group)
)

## $ANOVA
##                Effect DFn DFd         F            p p<.05        ges
## 2           age_group   1  38 47.176031 3.706667e-08     * 0.48465601
## 3           condition   2  76 27.271263 1.181245e-09     * 0.14822123
## 4 age_group:condition   2  76  5.481689 5.972183e-03     * 0.03379572
## 
## $`Mauchly's Test for Sphericity`
##                Effect         W         p p<.05
## 3           condition 0.9008222 0.1448182      
## 4 age_group:condition 0.9008222 0.1448182      
## 
## $`Sphericity Corrections`
##                Effect       GGe        p[GG] p[GG]<.05       HFe        p[HF]
## 3           condition 0.9097709 5.566488e-09         * 0.9530301 2.646471e-09
## 4 age_group:condition 0.9097709 7.698688e-03         * 0.9530301 6.815470e-03
##   p[HF]<.05
## 3         *
## 4         *

The ezANOVA() function provides us with more detail, for instance automatically providing Mauchly’s Test for Sphericity and both the Greenhouse-Geisser and Hyunh-Feldt corrections in the event that the sphericity assumption is violated. Most importantly, however, the outputs of these two functions agree when we look at the f-statistics for all main-effects and interactions when sphericity is assumed.

3.2. Mixed Factorial ANOVA as a mixed-effect model…

For this mixed-factorial design, we need to account for the fact that we have multiple observations coming from each person, so we will add a random-effect of “subID”. After accounting for this statistical dependence in our data, we can now fairly test the effects of Age Group, and Condition with residuals that are independent of each other.

# First we will define our model
mod3 <- lmer(speed ~ 
               # Fixed Effects:
               age_group*condition + 
               # Random Effects
               (1|subID), 
               # Define your data, 
             data=data_COND, REML=TRUE)

# We can then get the ANOVA results for our model:
anova(mod3)

Critically, note that the f-statistics and degrees of freedom all match our ANOVA outputs (when sphericity is assumed), with a main-effect of Age Group, F(1,38)=47.18, Condition, F(2,76)=27.27, and the Age Group x Condition interaction, F(2,76)=5.48.

If we want to delve deeper into our model, we can also use the summary() function to get more information about model fit statistics, parameter estimates, random-effects, and residuals.

summary(mod3)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: speed ~ age_group * condition + (1 | subID)
##    Data: data_COND
## 
## REML criterion at convergence: -189.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.6329 -0.4286 -0.0582  0.3525  3.7641 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subID    (Intercept) 0.009044 0.0951  
##  Residual             0.005169 0.0719  
## Number of obs: 120, groups:  subID, 40
## 
## Fixed effects:
##                        Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)             0.70000    0.02666 62.99267  26.258  < 2e-16 ***
## age_groupYA             0.28500    0.03770 62.99267   7.560 2.11e-10 ***
## conditionB              0.16950    0.02274 76.00000   7.455 1.21e-10 ***
## conditionC              0.08875    0.02274 76.00000   3.903 0.000204 ***
## age_groupYA:conditionB -0.10238    0.03215 76.00000  -3.184 0.002107 ** 
## age_groupYA:conditionC -0.07650    0.03215 76.00000  -2.379 0.019862 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) ag_gYA cndtnB cndtnC a_YA:B
## age_groupYA -0.707                            
## conditionB  -0.426  0.302                     
## conditionC  -0.426  0.302  0.500              
## ag_grpYA:cB  0.302 -0.426 -0.707 -0.354       
## ag_grpYA:cC  0.302 -0.426 -0.354 -0.707  0.500

4. MER as Mixed-Factorial ANOVA with Multiple Crossed Factors

2 (Age Group) x 3 (Condition) x 4 (Trial) ANOVA

Next, we will consider the more complicated example of our fully factorial design with a nested factor of Age-Group, and crossed factors of Condition and Time.

DATA$time <- factor(DATA$time)

ggplot(DATA, aes(x = time, y = speed)) +
  geom_point(aes(fill=age_group), size=2, shape=21,
             position=position_jitterdodge(dodge.width = 0.5, jitter.width = 0.1))+
  geom_boxplot(aes(fill=age_group), col="black", 
               alpha=0.4, width=0.5, outlier.shape = NA)+
  labs(fill="Age Group")+
  facet_wrap(~condition)+
  scale_x_discrete(name = "Trial") +
  scale_y_continuous(name = "Speed (m/s)", limits = c(0,3)) +
  theme(axis.text=element_text(size=16, color="black"), 
        axis.title=element_text(size=16, face="bold"),
        legend.text=element_text(size=16, color="black"), 
        legend.title=element_text(size=16, face="bold"),
        plot.title=element_text(size=16, face="bold", hjust=0.5),
        panel.grid.minor = element_blank(),
        strip.text = element_text(size=16, face="bold"),
        legend.position = "bottom")

4.1. As an ANOVA…

For this mixed factorial ANOVA, we have two factors that vary within subjects, Condition and Time, and we have one factor that varies between subjects, Age Group. As before, we can directly code this into our analysis of variance using the aov() function or using the ezANOVA() function from the “ez” package.

summary(aov(speed ~ age_group*condition*time + Error(subID/(condition*time)), data=DATA))

## 
## Error: subID
##           Df Sum Sq Mean Sq F value   Pr(>F)    
## age_group  1  6.095   6.095   47.18 3.71e-08 ***
## Residuals 38  4.910   0.129                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: subID:condition
##                     Df Sum Sq Mean Sq F value   Pr(>F)    
## condition            2 1.1278  0.5639  27.271 1.18e-09 ***
## age_group:condition  2 0.2267  0.1133   5.482  0.00597 ** 
## Residuals           76 1.5715  0.0207                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: subID:time
##                 Df Sum Sq Mean Sq F value Pr(>F)    
## time             3 10.618   3.539  113.88 <2e-16 ***
## age_group:time   3  0.365   0.122    3.92 0.0105 *  
## Residuals      114  3.543   0.031                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Error: subID:condition:time
##                           Df Sum Sq Mean Sq F value   Pr(>F)    
## condition:time             6 0.7851 0.13084  13.718 2.71e-13 ***
## age_group:condition:time   6 0.0880 0.01467   1.539    0.166    
## Residuals                228 2.1746 0.00954                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ezANOVA(data = DATA, 
    dv = .(speed),
    wid = .(subID),
    within = .(condition, time),
    between = .(age_group)
)

## $ANOVA
##                     Effect DFn DFd          F            p p<.05        ges
## 2                age_group   1  38  47.176031 3.706667e-08     * 0.33318007
## 3                condition   2  76  27.271263 1.181245e-09     * 0.08462822
## 5                     time   3 114 113.882930 3.746328e-34     * 0.46536693
## 4      age_group:condition   2  76   5.481689 5.972183e-03     * 0.01824443
## 6           age_group:time   3 114   3.919720 1.049688e-02     * 0.02908814
## 7           condition:time   6 228  13.718345 2.705162e-13     * 0.06046299
## 8 age_group:condition:time   6 228   1.538560 1.664920e-01       0.00716581
## 
## $`Mauchly's Test for Sphericity`
##                     Effect         W            p p<.05
## 3                condition 0.9008222 1.448182e-01      
## 4      age_group:condition 0.9008222 1.448182e-01      
## 5                     time 0.3554491 3.872111e-07     *
## 6           age_group:time 0.3554491 3.872111e-07     *
## 7           condition:time 0.2268549 8.327332e-05     *
## 8 age_group:condition:time 0.2268549 8.327332e-05     *
## 
## $`Sphericity Corrections`
##                     Effect       GGe        p[GG] p[GG]<.05       HFe
## 3                condition 0.9097709 5.566488e-09         * 0.9530301
## 4      age_group:condition 0.9097709 7.698688e-03         * 0.9530301
## 5                     time 0.7021534 1.020427e-24         * 0.7443488
## 6           age_group:time 0.7021534 2.194748e-02         * 0.7443488
## 7           condition:time 0.6718683 1.241027e-09         * 0.7615676
## 8 age_group:condition:time 0.6718683 1.933581e-01           0.7615676
##          p[HF] p[HF]<.05
## 3 2.646471e-09         *
## 4 6.815470e-03         *
## 5 4.686178e-26         *
## 6 1.975578e-02         *
## 7 1.230714e-10         *
## 8 1.856659e-01

4.2. Multi-Way Mixed Factorial ANOVA as a mixed-effect model…

For this mixed-factorial design, we need to account for the fact that we have multiple observations coming from each person, but we also need to acocunt for the fact that we multiple observations for each Condition and on each Trial. In order to account for this dependence in our data, we need to include random-effects of subject, subject:condition, and subject:time. Adding these random-effects to our model will make our mixed-effects model statistically equivalent to the mixed-factorial ANOVAs that we ran above.

# First we will define our model
mod4 <- lmer(speed ~ 
               # Fixed Effects:
               age_group*condition*time + 
               # Random Effects
               (1|subID)+(1|condition:subID)+(1|time:subID), 
               # Define your data, 
             data=DATA, REML=TRUE)

# We can then get the ANOVA results for our model:
anova(mod4)

Note that the f-statistics and the degrees of freedom match what we got from the ANOVA tables above (when sphericity is assumed). There are a lot of effects so I wont list them all, but note the main-effects of Age Group, F(1,38)=47.18, Condition, F(2,76)=27.27, and Time, F(3,114)=113.88.

Readers familiar with repeated measures ANOVA wont be surprised to see that that the output of the aov() function partitions the effects into different sections. The between-subject effect of Age Group is evaluated using residual error at the level of “Error:subID”. The effects of Condition and Age Group x Condition are evaluated using residuals of “Error: subID:condition”. This is precisely what we are doing when we include the three different random-intercepts in the mixed-effects regression. The \((1|subID)\) partitions out the average individual differences between subjects, \((1|condition:subID)\) and \((1|time:subID)\) partitions out the variance at the different levels of our repeated measures.

If we want to delve deeper into our model, we can also use the summary() function to get more information about model fit statistics, parameter estimates, random-effects and residuals.

summary(mod4)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: 
## speed ~ age_group * condition * time + (1 | subID) + (1 | condition:subID) +  
##     (1 | time:subID)
##    Data: DATA
## 
## REML criterion at convergence: -463.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.8077 -0.4203 -0.0405  0.3229  3.2902 
## 
## Random effects:
##  Groups          Name        Variance Std.Dev.
##  time:subID      (Intercept) 0.007181 0.08474 
##  condition:subID (Intercept) 0.002785 0.05277 
##  subID           (Intercept) 0.007249 0.08514 
##  Residual                    0.009538 0.09766 
## Number of obs: 480, groups:  time:subID, 160; condition:subID, 120; subID, 40
## 
## Fixed effects:
##                               Estimate Std. Error        df t value Pr(>|t|)
## (Intercept)                    0.79400    0.03657 186.55257  21.710  < 2e-16
## age_groupYA                    0.43150    0.05172 186.55257   8.343 1.58e-14
## conditionB                     0.34850    0.03510 263.60569   9.928  < 2e-16
## conditionC                     0.17750    0.03510 263.60569   5.056 8.01e-07
## time2                         -0.06700    0.04089 249.82467  -1.639 0.102553
## time3                         -0.13950    0.04089 249.82467  -3.412 0.000753
## time4                         -0.16950    0.04089 249.82467  -4.145 4.65e-05
## age_groupYA:conditionB        -0.20450    0.04964 263.60569  -4.119 5.09e-05
## age_groupYA:conditionC        -0.15250    0.04964 263.60569  -3.072 0.002350
## age_groupYA:time2             -0.17200    0.05782 249.82467  -2.975 0.003222
## age_groupYA:time3             -0.22800    0.05782 249.82467  -3.943 0.000105
## age_groupYA:time4             -0.18600    0.05782 249.82467  -3.217 0.001468
## conditionB:time2              -0.15450    0.04368 228.00014  -3.537 0.000489
## conditionC:time2              -0.08950    0.04368 228.00014  -2.049 0.041587
## conditionB:time3              -0.26350    0.04368 228.00014  -6.033 6.44e-09
## conditionC:time3              -0.12300    0.04368 228.00014  -2.816 0.005285
## conditionB:time4              -0.29800    0.04368 228.00014  -6.823 7.94e-11
## conditionC:time4              -0.14250    0.04368 228.00014  -3.263 0.001273
## age_groupYA:conditionB:time2   0.16850    0.06177 228.00014   2.728 0.006868
## age_groupYA:conditionC:time2   0.10250    0.06177 228.00014   1.659 0.098394
## age_groupYA:conditionB:time3   0.13100    0.06177 228.00014   2.121 0.035011
## age_groupYA:conditionC:time3   0.11750    0.06177 228.00014   1.902 0.058388
## age_groupYA:conditionB:time4   0.10900    0.06177 228.00014   1.765 0.078951
## age_groupYA:conditionC:time4   0.08400    0.06177 228.00014   1.360 0.175185
##                                 
## (Intercept)                  ***
## age_groupYA                  ***
## conditionB                   ***
## conditionC                   ***
## time2                           
## time3                        ***
## time4                        ***
## age_groupYA:conditionB       ***
## age_groupYA:conditionC       ** 
## age_groupYA:time2            ** 
## age_groupYA:time3            ***
## age_groupYA:time4            ** 
## conditionB:time2             ***
## conditionC:time2             *  
## conditionB:time3             ***
## conditionC:time3             ** 
## conditionB:time4             ***
## conditionC:time4             ** 
## age_groupYA:conditionB:time2 ** 
## age_groupYA:conditionC:time2 .  
## age_groupYA:conditionB:time3 *  
## age_groupYA:conditionC:time3 .  
## age_groupYA:conditionB:time4 .  
## age_groupYA:conditionC:time4    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## 
## Correlation matrix not shown by default, as p = 24 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it

Section 3: Mixed-Effects Models for Factorial Designs

Keith Lohse, PhD, PStat

2022-10-04

Note that this page is in the process of being updated 2022-10-04

1. MER as One-Way Repeated Measures ANOVA

(A Single Crossed Factor)

1.1. As an ANOVA…

1.2. Getting the same result with a mixed-effect model…

2. MER as Two-Way Repeated Measures ANOVA

(Multiple Crossed Factors)

2.1. As an ANOVA…

2.2. Two-way RM ANOVA as a mixed-effect model…

3. MER as Mixed-Factorial ANOVA with One Crossed Factor

(2 (Age) x 3 (Condition) ANOVA Example

3.1. As an ANOVA…

3.2. Mixed Factorial ANOVA as a mixed-effect model…

4. MER as Mixed-Factorial ANOVA with Multiple Crossed Factors

2 (Age Group) x 3 (Condition) x 4 (Trial) ANOVA

4.1. As an ANOVA…

4.2. Multi-Way Mixed Factorial ANOVA as a mixed-effect model…