Working with a new data set, we now want to switch from modeling time as a factor (i.e., treating Trial 1 as categorically different from Trial 2) to treating time as a continuous variable.
In our previous data set, we used the hypothetical cross-sectional study of younger and older adults. Both groups (hypothetically) walked in anxiety provoking conditions (let’s say we simulated a virtual alligator behind them) that initially led them to walk faster than they normally would. After repeated exposures however (4 trials), both groups started to walk slower. In this experiment, our time variable is the number of trials. Each trial came at approximately the same time for each person, so there is no between-subject variability in the time variable.
For the new example though, we will use some fake data that have a structure more like what we might encounter in a longitudinal study. These fake data follow patients with different types of spinal cord injury (given by “AIS Grade”) through 18 months of rehabilitation. The outcome is their Rasch-scaled score on the Functional Independence Measure (FIM). Without getting into the weeds of what that means, the values range from 0 to 100, with 100 indicating complete independence in the activities of daily life. Beyond having a lot more observations per person, this dataset also measures time continuously. That is, rather than trials that were all collected at exactly the same time, these data were all collected on different days for different people. Thus, Month 1 as a time point might be Day 20 for some people, but Day 30 for others. One of the strengths of the mixed-effects model is that we can retain this variability in our \(X\) variable, by treating time continuously rather than categorically.
We will explore these data in more detail below. One of the key differences between these models and the factorial models we considered before is that now we will need to deal with model comparisons. In the factorial designs, we knew which factors we wanted to include in our model and we always tested the fully factorial design (i.e., all main-effects and interactions). In contrast, for these longitudinal models, we will need to compare different ways to represent time (e.g., linear versus curvilinear models). After we compare models to decide on the best to model time, we can start adding in other variables to explain the differences in individual trajectories.
In order to do this, we will need to use Full Maximum Likelihood to estimate the different model parameters rather than Restricted Maximum Likelihood. Additionally, we will need to decide on a metric for deciding when one model is statistically better than another model. I will cover these topics briefly below, but I would direct interested readers to Jeff Long’s, Longitudinal Data Analysis for the Behavioral Science using R for a more detailed discussion.
Starting with modeling changes in functional independence across 18 months of rehabilitation, we can test a series of unconditional random-effects models to decide how we want to model the effect of time. These models are said to be “unconditional” because the effect of time does not depend on any other variables. In this series of models, the goal is to establish the best fitting “shape” of our time variable (e.g., should time be linear or quadratic?) and which random-effects are essential to include in the model.
In these models, I am first going to convert time from months into years in order to avoid any scaling issues in our models. You could also center time around the average time, which I call year.c in the models below (\(year_i - \overline{year}\)). Centering time around it’s mean changes the interpretation of the intercept from being the predicted value when Time is 0, to the predicted value on average over time.
# Centering time on the first point and converting to years:
DAT2$year.0 <- (DAT2$time-1)/12
DAT2$year.0_sq<-DAT2$year.0^2
DAT2$year.0_cu<-DAT2$year.0^3
# Centering time on the mean time:
DAT2$year.c <- scale(DAT2$time)
DAT2$year.c_sq<-DAT2$year.c^2
DAT2$year.c_cu<-DAT2$year.c^3
Centering time on the first time point and centering time on the mean are equally appropriate choices mathematically, but they do change how we will interpret the intercept, so it is very important to understand how your variables are coded. In general, treating the first data point as time = 0 makes the intercept more interpretable. However, mean-centering (or z-transforming) the time variable can help reduce collinearity in a model, especially when interactions are present.
# Random intercepts model ----
raneff_00<-lmer(rasch_FIM~
# Fixed-effects
1+
# Random-effects
(1|subID), data=DAT2, REML=FALSE,
control=lmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=5e5)))
summary(raneff_00)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: rasch_FIM ~ 1 + (1 | subID)
## Data: DAT2
## Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 5e+05))
##
## AIC BIC logLik deviance df.resid
## 5865.9 5879.6 -2929.9 5859.9 717
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9937 -0.4357 0.2207 0.6874 1.6748
##
## Random effects:
## Groups Name Variance Std.Dev.
## subID (Intercept) 90.95 9.537
## Residual 176.15 13.272
## Number of obs: 720, groups: subID, 40
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 44.904 1.587 40.000 28.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Random slope random intercepts model ----
raneff_01<-lmer(rasch_FIM~
# Fixed-effects
1+year.0+
# Random-effects
(1+year.0|subID), data=DAT2, REML=FALSE,
control=lmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=5e5)))
summary(raneff_01)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: rasch_FIM ~ 1 + year.0 + (1 + year.0 | subID)
## Data: DAT2
## Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 5e+05))
##
## AIC BIC logLik deviance df.resid
## 4824.9 4852.4 -2406.5 4812.9 714
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.7246 -0.4827 0.1478 0.6544 2.1393
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## subID (Intercept) 46.48 6.818
## year.0 46.85 6.844 0.37
## Residual 35.48 5.956
## Number of obs: 720, groups: subID, 40
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 25.855 1.165 39.979 22.20 <2e-16 ***
## year.0 25.367 1.195 39.981 21.23 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## year.0 0.175
# Random quadratic slopes and intercepts model ----
raneff_02<-lmer(rasch_FIM~
# Fixed-effects
1+year.0+year.0_sq+
# Random-effects
(1+year.0+year.0_sq|subID), data=DAT2, REML=FALSE,
control=lmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=5e5)))
summary(raneff_02)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: rasch_FIM ~ 1 + year.0 + year.0_sq + (1 + year.0 + year.0_sq |
## subID)
## Data: DAT2
## Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 5e+05))
##
## AIC BIC logLik deviance df.resid
## 4086.4 4132.2 -2033.2 4066.4 710
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.4974 -0.5421 -0.0056 0.5677 3.0697
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## subID (Intercept) 48.03 6.931
## year.0 234.10 15.300 -0.02
## year.0_sq 35.94 5.995 0.02 -0.94
## Residual 10.57 3.251
## Number of obs: 720, groups: subID, 40
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 15.969 1.152 39.907 13.86 <2e-16 ***
## year.0 64.492 2.648 40.391 24.36 <2e-16 ***
## year.0_sq -25.786 1.170 41.270 -22.03 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) year.0
## year.0 -0.125
## year.0_sq 0.148 -0.925
# Random cubic slopes and intercepts model ----
raneff_03<-lmer(rasch_FIM~
# Fixed-effects
1+year.0+year.0_sq+year.0_cu+
# Random-effects
(1+year.0+year.0_sq+year.0_cu|subID), data=DAT2, REML=FALSE,
control=lmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=2e5)))
summary(raneff_03)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: rasch_FIM ~ 1 + year.0 + year.0_sq + year.0_cu + (1 + year.0 +
## year.0_sq + year.0_cu | subID)
## Data: DAT2
## Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05))
##
## AIC BIC logLik deviance df.resid
## 3844.1 3912.8 -1907.1 3814.1 705
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.0320 -0.5105 -0.0032 0.5039 3.2489
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## subID (Intercept) 43.87 6.624
## year.0 626.89 25.038 -0.05
## year.0_sq 921.27 30.352 0.01 -0.88
## year.0_cu 137.88 11.742 0.04 0.76 -0.98
## Residual 6.59 2.567
## Number of obs: 720, groups: subID, 40
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 11.921 1.111 39.984 10.731 2.44e-13 ***
## year.0 96.527 4.473 40.075 21.580 < 2e-16 ***
## year.0_sq -78.115 5.740 39.993 -13.610 < 2e-16 ***
## year.0_cu 22.770 2.295 40.418 9.923 2.15e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) year.0 yr.0_s
## year.0 -0.177
## year.0_sq 0.145 -0.895
## year.0_cu -0.100 0.791 -0.978
Now that we have created several different models, how do we decide which models are the best? How do we decide which parameters to include and which parameters are “statistically significant”? When it comes to mixed-effect linear models (or other forms of “multi-level” models), we use a similar criterion to traditional regression. That is, we still rely on the general idea that: \[ Data_i = Model_i + Error_i \]
… and if the error is reduced by a large enough magnitude, then we will call that effect statistically significant (i.e., the parameter reduced error by an unusually larger amount under the null-hypothesis.). However, there are some key differences between mixed-effect models and traditional Ordinary Least Squares regression that we need to discuss.
For instance, you might have noticed that p-values are conspicuously absent from the LME4 output. The reasons for this are complicated, but it has to do with the fact that these models can have statistical dependencies and unbalanced/missing data that do not make the calculation of traditional p-values tenable. In short, your denominator degrees of freedom can get a little crazy. (You can also read an explanation from Doug Bates, a main author of the lme4 package, here: https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html). However, the key things that you should know are:
If you really, really want p-values for individual parameters, you can get them from packages that implement the Welch-Satterthwaite approximation to estimate the appropriate degrees of freedom, like the “lmerTest”” package.
We are most interested in comparisons between models and less so the individual parameters within a model. All of our models were fit using Full Maximum Likelihood Estimation and we judge models based on a reduction in ERROR that we call Deviance. Fortunately, there are several quantitative and objective methods for evaluating the change in deviance. We will focus on two of these methods, the Wald Test for the change in Deviance and the Akaike Information Criterion (AIC).
You can read more about Full Maximum Likelihood Estimation from other resources, but conceptually the idea of maximum likelihood estimation is that our computer tests a long series of parameters until it arrives at the specific set of parameters that lead to the smallest amount of error in our data. This is why it is referred to as “maximum likelihood”, because we arrive at the set of values (for the given parameters) that are most likely to have produced the data we observed. The goodness of fit for these parameter estimates is quantified in the Deviance, which is a transformation of the Likelihood.
\[ Deviance = -2*log(Likelihood) \]
Where likelihood is defined by the amount of error (\(\epsilon\)) left behind by our estimates. Thus, if we delve a little deeper, the deviance is:
\[ Deviance = N*log(2\pi\sigma^2_\epsilon)+(1/\sigma^2_\epsilon)*(\sum(\epsilon^2_i)) \]
This formula is kind of scary, but there are two things you need to notice about it:
Looking at the right side of the equation, notice that the deviance is still largely determined by the sum of errors. Thus, all else being equal, smaller errors are going to lead to a smaller deviance.
Notice that size our sample shows up in two different places (the N at the beginning and the fact that we are summing over N on the right hand side). This means that the deviance is sensitive to the amount of data we are using. Thus, if we want to compare models based on their deviance, those models need to be based on the same amount of data.
Now we are ready to make a comparison between some of our different statistical models by comparing the change in the Deviance. Let’s start by comparing the Random Intercepts, Fixed Slopes, and Random Slopes models using the anova() function in R.
anova(raneff_00,raneff_01,raneff_02, raneff_03)
The anova() function gives a number of valuable pieces of information. First, it explicitly lists the models that are being tested. Below, it gives us the model name and the degrees of freedom for each model. It then gives us the Akaike’s Information Criterion (AIC), the related Bayesian Information Criterion (or BIC), the log of the Likelihood (or logLik), and the Deviance.
The Random Intercepts model can act as our “benchmark” against which other models can be compared. For instance, the reduction in deviance from the Random Intercepts (Deviance = 5859.9) to the Random Linear Slopes model (Deviance = 4812.9) is 1041; which is a huge reduction! The change in deviance follows a \(\chi^2\) distribution, which allows us to see if model is a statistical signficant improvement beyond the preceeding model.
The based on the Wald Test of the change in deviance, it appears that all of our models are successive improvements, despite the large increases to the degrees of freedom. The linear model is better than the random-intercepts model, \(\chi^2(3)=1046.98, p < 0.001\). the quadratic model is better than the linear, \(\chi^2(4)=746.46, p < 0.001\), and the cubic model provides an even better fit than the quadratic, \(\chi^2(5)=252.32, p < 0.001\). Thus, we would conclude that the cubic model is the “best” unconditional model (assuming \(\alpha = 0.05\)).
The AIC is another method to evaluating model fit that is based on the Deviance. Although we won’t get into the details of the math behind it, the importance of the AIC is in the name “Information Criterion”. That is, the AIC is all about informativeness which is slightly different from just being the model that produces the smallest deviance. For a model to be informative, the parameters need to generalize to other new datasets, not just provide the best explanation of the current data. Therefore, the AIC introduces a penalty to reduce over-fitting of the model: \[ AIC = Deviance + 2(k) \]
In the formula above, k = number of parameters in the model. Thus, for a parameter to improve the AIC it has to reduce the deviance by >2 times the number of parameters. As mentioned, we won’t get into why the magic number of 2(k) seems to work so well, but the key thing to know is that the AIC imposes a penalty based on the number of parameters and is thus a more conservative test than the Wald Test of the change in Deviance. We can see this in action by comparing our models in the ANOVA output above. Note that the AIC is reliably larger than the simple deviance by a factor of \(2*k\).
Ultimately, we still arrive at the same conclusion because the cubic model has the lowest AIC. However, it is important that we base this decision on the AIC rather than just the Wald Test of the change in deviance. The reason for this comes back to over-fitting. Note that the cubic model uses 15 degrees of freedom, which is a lot more than the simpler linear model that used only 6. Adding 9 extra parameters is likely to reduce the deviance just by chance, so by introducing a penalty for every additional parameter, we can really make sure that these additional parameters are pulling their own weight, so to speak. (For a more detailed explanation of the AIC and it’s unique properties for improving generalizability/reducing over-fitting, see Jeff Long’s excellent book, Longitudinal Data Analysis for the Behavioral Sciences using R).
As a word of caution, there are no fixed cut-offs for a “statistically significant” change in the AIC although some research has been done exploring how the AIC relates to other measures of effect-size (see Long, 2012). In general, it is a good idea to declare a minimum change in the AIC in advance. Keep in mind that any reduction in the AIC means that model is explaining more of the deviance than the complexity it has gained in additional parameters. However, we might want to set a minimum difference of 1 whole point or 2 whole points for selecting a “superior” model. For instance, Anderson (2008, chap. 4) describes a \(\Delta AIC = 4\) as a “strong” difference in the explanatory power of two models and \(\Delta AIC = 8\) as a “very strong” difference. After all, the exact AIC would depend on your data conforming to all of your model’s assumptions. For that reason, setting a minimum cut-off or 1 or 2 AIC points is useful, because your AIC may not be precise to the level of 0.01 or 0.1.
For instance, when the sample size is small, there is a good chance that the AIC will not be penalizing additional parameters enough to avoid overfitting. To address this issue, researchers have developed the AIC-corrected, or \(AICc\). Assuming that your model is univariate in it’s outcome, linear in its parameters, and has an (approximately) normally distributed residuals, then the AICc can be calculated by: \[ AICc = AIC + (2k^2+2k)/(n-k-1) \] Where \(n\) is the sample size and \(k\) is the number of parameters. As such, the AICc is essentially the AIC with an extra penalty on the number of parameters. As the sample size gets larger, the AICc will effectively converge on the AIC.
These rules-of-thumb should not be treated as hard-cutoffs however, and analysts should treat the AIC continuously. I do think that these guidelines are useful, however, because our models will likely violate our assumptions to some degree. Using integer values of the AIC then, builds in a “buffer” to help us make sure that we are select a model that truly explains more variability than it adds in complexity.
Once we have settled on our unconditional model of change over time, we can start to add factors to our model to try explain individual differences in these trajectories.
Specifically, we will add the factor of AIS Grade to our model, as well as interactions between AIS Grade and time.
After creating this conditional model, we can first check to see if it is an improvement above the unconditional models, by comparing their respective AICs using the anova() function.
# Effect of AIS Grade on Time
cond_01<-lmer(rasch_FIM~
# Fixed-effects
1+year.0*AIS_grade+year.0_sq*AIS_grade+year.0_cu*AIS_grade+
# Random-effects
(1+year.0+year.0_sq+year.0_cu|subID), data=DAT2, REML=FALSE,
control=lmerControl(optimizer="bobyqa",
optCtrl=list(maxfun=5e5)))
anova(raneff_03, cond_01)
As this model is an improvement beyond the unconditional models, based on the AIC reduction, we can then delve deeper into the model by passing only the best fitting model to the anova() function, which will give us the omnibus F-test for the different main-effects and interactions (similar to a more traditional ANOVA table).
anova(cond_01)
#Note that I actually prefer to use the Anova() function from the car package, where you can easily specify the type of SS calculation.
Anova(cond_01, type="III")
If we really want to drill down into the model, we can get the specific parameters estimates for the fixed-effects (similar to an OLS regression output) and the variance/covariance estimates for the random-effects by using the summary() function.
summary(cond_01)
## Linear mixed model fit by maximum likelihood . t-tests use Satterthwaite's
## method [lmerModLmerTest]
## Formula: rasch_FIM ~ 1 + year.0 * AIS_grade + year.0_sq * AIS_grade +
## year.0_cu * AIS_grade + (1 + year.0 + year.0_sq + year.0_cu | subID)
## Data: DAT2
## Control: lmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 5e+05))
##
## AIC BIC logLik deviance df.resid
## 3818.5 3923.8 -1886.2 3772.5 697
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.1937 -0.5058 -0.0100 0.5192 3.2274
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## subID (Intercept) 23.247 4.821
## year.0 541.662 23.274 -0.43
## year.0_sq 846.025 29.087 0.28 -0.86
## year.0_cu 128.034 11.315 -0.19 0.73 -0.97
## Residual 6.593 2.568
## Number of obs: 720, groups: subID, 40
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 6.915 1.433 40.111 4.826 2.05e-05 ***
## year.0 84.472 7.133 39.673 11.842 1.34e-14 ***
## AIS_gradeC5-8 5.715 1.889 40.153 3.026 0.00431 **
## AIS_gradeT1-S5 13.108 2.478 39.907 5.289 4.73e-06 ***
## year.0_sq -66.742 9.401 39.414 -7.099 1.48e-08 ***
## year.0_cu 18.676 3.781 39.860 4.939 1.45e-05 ***
## year.0:AIS_gradeC5-8 16.353 9.418 39.926 1.736 0.09020 .
## year.0:AIS_gradeT1-S5 24.377 12.350 39.585 1.974 0.05541 .
## AIS_gradeC5-8:year.0_sq -15.393 12.419 39.729 -1.239 0.22246
## AIS_gradeT1-S5:year.0_sq -22.995 16.241 38.952 -1.416 0.16477
## AIS_gradeC5-8:year.0_cu 5.544 4.997 40.191 1.110 0.27379
## AIS_gradeT1-S5:year.0_cu 8.251 6.511 38.878 1.267 0.21257
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) year.0 AIS_gC5-8 AIS_gT1-S5 yr.0_s yr.0_c y.0:AIS_C
## year.0 -0.522
## AIS_grdC5-8 -0.759 0.396
## AIS_grT1-S5 -0.578 0.302 0.439
## year.0_sq 0.391 -0.889 -0.296 -0.226
## year.0_cu -0.307 0.779 0.233 0.177 -0.977
## y.0:AIS_C5- 0.395 -0.757 -0.523 -0.229 0.674 -0.590
## y.0:AIS_T1- 0.302 -0.578 -0.229 -0.522 0.514 -0.450 0.437
## AIS_grdC5-8:yr.0_s -0.296 0.673 0.392 0.171 -0.757 0.739 -0.890
## AIS_grdT1-S5:yr.0_s -0.226 0.515 0.172 0.391 -0.579 0.565 -0.390
## AIS_grdC5-8:yr.0_c 0.232 -0.590 -0.309 -0.134 0.739 -0.757 0.780
## AIS_grdT1-S5:yr.0_c 0.178 -0.453 -0.135 -0.307 0.567 -0.581 0.343
## y.0:AIS_T AIS_grdC5-8:yr.0_s AIS_grdT1-S5:yr.0_s
## year.0
## AIS_grdC5-8
## AIS_grT1-S5
## year.0_sq
## year.0_cu
## y.0:AIS_C5-
## y.0:AIS_T1-
## AIS_grdC5-8:yr.0_s -0.389
## AIS_grdT1-S5:yr.0_s -0.890 0.438
## AIS_grdC5-8:yr.0_c 0.341 -0.977 -0.428
## AIS_grdT1-S5:yr.0_c 0.780 -0.429 -0.977
## AIS_grdC5-8:yr.0_c
## year.0
## AIS_grdC5-8
## AIS_grT1-S5
## year.0_sq
## year.0_cu
## y.0:AIS_C5-
## y.0:AIS_T1-
## AIS_grdC5-8:yr.0_s
## AIS_grdT1-S5:yr.0_s
## AIS_grdC5-8:yr.0_c
## AIS_grdT1-S5:yr.0_c 0.440
# Negative Exponential Model ----
ggplot(DAT2, aes(x = year.0, y = rasch_FIM)) +
geom_line(aes(group=subID), col="grey") +
#stat_smooth(method="lm", formula=y~x+I(x^2), col="black", lwd=1.5, se=FALSE)+
scale_x_continuous(name = "Time from Admission (Years)") +
scale_y_continuous(name = "Rasch-Scaled FIM Score (0-100)",limits=c(0,100)) +
theme_bw() +
theme(axis.text=element_text(size=14, colour="black"),
axis.title=element_text(size=14,face="bold")) +
theme(strip.text.x = element_text(size = 14))+ theme(legend.position="none")
library(nlme)
##
## Attaching package: 'nlme'
## The following object is masked from 'package:lme4':
##
## lmList
## The following object is masked from 'package:dplyr':
##
## collapse
set.seed(100)
neg_exp_rand_mod <- nlme(rasch_FIM ~ b_1i +
(b_2i)*(exp(b_3i * year.0)),
data = DAT2,
fixed = b_1i + b_2i + b_3i ~ 1,
random = list(b_1i ~ 1, b_2i ~ 1, b_3i ~1),
groups = ~ subID,
start = c(80, -70, -1),
na.action = na.omit)
summary(neg_exp_rand_mod)
## Nonlinear mixed-effects model fit by maximum likelihood
## Model: rasch_FIM ~ b_1i + (b_2i) * (exp(b_3i * year.0))
## Data: DAT2
## AIC BIC logLik
## 3807.147 3852.94 -1893.574
##
## Random effects:
## Formula: list(b_1i ~ 1, b_2i ~ 1, b_3i ~ 1)
## Level: subID
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## b_1i 14.1015539 b_1i b_2i
## b_2i 12.8736817 -0.859
## b_3i 0.8191827 0.564 -0.446
## Residual 2.5385854
##
## Fixed effects: b_1i + b_2i + b_3i ~ 1
## Value Std.Error DF t-value p-value
## b_1i 58.53985 2.256795 678 25.93937 0
## b_2i -48.13958 2.078622 678 -23.15937 0
## b_3i -2.53036 0.142557 678 -17.74984 0
## Correlation:
## b_1i b_2i
## b_2i -0.845
## b_3i 0.549 -0.390
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.16006126 -0.52003742 -0.03391454 0.51671598 3.18389619
##
## Number of Observations: 720
## Number of Groups: 40
coef(neg_exp_rand_mod)
fitted(neg_exp_rand_mod)
## s01 s01 s01 s01 s01 s01 s01 s01
## 13.764352 27.934466 35.407504 41.184922 42.078257 46.315994 47.563770 49.715686
## s01 s01 s01 s01 s01 s01 s01 s01
## 50.528657 51.398616 51.993297 52.487531 52.783724 53.109660 53.174674 53.307447
## s01 s01 s02 s02 s02 s02 s02 s02
## 53.428061 53.541314 30.244074 38.362707 48.800011 51.121223 58.195460 61.421269
## s02 s02 s02 s02 s02 s02 s02 s02
## 63.889070 67.437235 70.814503 72.894807 75.227068 76.414510 76.809006 78.397999
## s02 s02 s02 s02 s03 s03 s03 s03
## 79.323858 79.518195 80.232872 81.113955 14.549918 19.979807 24.902580 33.431055
## s03 s03 s03 s03 s03 s03 s03 s03
## 34.338005 40.966198 42.723403 46.644940 49.219015 50.610522 52.867974 54.173193
## s03 s03 s03 s03 s03 s03 s04 s04
## 56.223147 56.412245 57.600602 59.562758 60.329096 60.695275 5.095486 11.461552
## s04 s04 s04 s04 s04 s04 s04 s04
## 15.885643 22.815686 24.871341 27.108088 30.918740 32.006303 32.943615 34.129009
## s04 s04 s04 s04 s04 s04 s04 s04
## 34.458444 35.162000 35.657291 36.242309 36.400892 36.646868 36.705298 36.866171
## s05 s05 s05 s05 s05 s05 s05 s05
## 12.196615 23.698991 30.143418 37.476730 39.435422 44.391000 47.753785 50.973531
## s05 s05 s05 s05 s05 s05 s05 s05
## 53.577235 54.208976 57.738838 58.357523 59.828996 61.030742 62.254480 63.058206
## s05 s05 s06 s06 s06 s06 s06 s06
## 63.604021 64.659383 8.251514 19.829211 28.973771 31.880547 36.947226 41.113564
## s06 s06 s06 s06 s06 s06 s06 s06
## 45.862767 48.329975 48.616874 50.946646 52.644764 53.329888 54.570805 55.506776
## s06 s06 s06 s06 s07 s07 s07 s07
## 55.558518 56.081722 56.737070 57.110113 18.291027 23.772029 30.285314 42.001974
## s07 s07 s07 s07 s07 s07 s07 s07
## 46.994864 48.751556 53.570222 59.345343 60.925535 63.285062 64.620091 66.079938
## s07 s07 s07 s07 s07 s07 s08 s08
## 68.215205 68.629090 69.589951 70.565157 71.180410 71.910021 13.793077 22.024739
## s08 s08 s08 s08 s08 s08 s08 s08
## 31.329402 38.468716 46.603165 48.710326 54.774412 55.177548 60.661564 60.932627
## s08 s08 s08 s08 s08 s08 s08 s08
## 64.455395 65.228823 67.197019 68.989134 69.590989 70.991273 71.553849 73.348619
## s09 s09 s09 s09 s09 s09 s09 s09
## 8.991320 16.580753 21.984499 25.195499 27.579006 29.083536 29.826362 30.826134
## s09 s09 s09 s09 s09 s09 s09 s09
## 31.639555 32.074422 32.356910 32.680218 32.806441 32.847773 32.942717 32.970991
## s09 s09 s10 s10 s10 s10 s10 s10
## 33.002692 33.035391 6.607821 12.540108 21.712244 25.349807 25.562268 30.973983
## s10 s10 s10 s10 s10 s10 s10 s10
## 32.158823 34.757572 36.348201 37.537681 39.100750 40.278688 41.127427 41.671352
## s10 s10 s10 s10 s11 s11 s11 s11
## 42.375859 43.143026 43.456431 44.000113 20.579124 30.317176 36.992767 39.627631
## s11 s11 s11 s11 s11 s11 s11 s11
## 46.107520 49.138786 54.920058 56.595664 58.915463 60.304935 61.341550 62.342079
## s11 s11 s11 s11 s11 s11 s12 s12
## 64.188048 64.366617 65.429818 66.077495 66.297578 66.898300 17.959919 24.351482
## s12 s12 s12 s12 s12 s12 s12 s12
## 29.510580 37.413891 41.975946 47.588280 50.685882 51.996914 53.159833 53.635376
## s12 s12 s12 s12 s12 s12 s12 s12
## 54.648183 55.127199 55.414979 55.818366 55.946253 56.105235 56.168847 56.284366
## s13 s13 s13 s13 s13 s13 s13 s13
## 4.347424 8.243574 16.837635 19.094207 23.808047 24.735608 27.711123 30.448308
## s13 s13 s13 s13 s13 s13 s13 s13
## 33.440524 34.741158 36.789083 37.155229 38.509361 39.567064 39.929514 40.646307
## s13 s13 s14 s14 s14 s14 s14 s14
## 41.224456 42.444051 9.043923 28.900258 35.723561 43.365184 50.907698 56.352055
## s14 s14 s14 s14 s14 s14 s14 s14
## 63.375624 64.950371 68.760251 69.436557 71.061097 72.764861 73.626443 74.094472
## s14 s14 s14 s14 s15 s15 s15 s15
## 74.702411 75.311946 75.623164 76.243198 24.874252 29.227501 32.463082 36.385513
## s15 s15 s15 s15 s15 s15 s15 s15
## 41.498667 43.543521 46.233580 49.352004 52.058003 52.493136 54.715674 56.210627
## s15 s15 s15 s15 s15 s15 s16 s16
## 57.659029 58.395902 59.059118 59.391076 60.544887 61.270092 16.914006 26.306443
## s16 s16 s16 s16 s16 s16 s16 s16
## 30.406155 35.864796 40.591965 43.913818 48.912314 50.043447 52.911837 53.947733
## s16 s16 s16 s16 s16 s16 s16 s16
## 54.919972 56.206176 56.992219 58.063592 58.283070 58.687878 59.112139 59.475593
## s17 s17 s17 s17 s17 s17 s17 s17
## 28.141252 35.083473 38.998534 42.859358 46.494250 50.861483 52.056171 55.221493
## s17 s17 s17 s17 s17 s17 s17 s17
## 57.264795 58.807734 59.009351 60.798852 61.322787 62.542282 63.110266 63.970032
## s17 s17 s18 s18 s18 s18 s18 s18
## 64.380082 65.517099 15.695580 29.830799 38.895051 41.248125 49.196706 50.961838
## s18 s18 s18 s18 s18 s18 s18 s18
## 54.132626 56.668791 58.212790 59.879007 61.408853 62.295317 62.816135 63.314018
## s18 s18 s18 s18 s19 s19 s19 s19
## 64.029086 64.086294 64.432896 64.848216 18.896161 27.288653 35.424997 38.226881
## s19 s19 s19 s19 s19 s19 s19 s19
## 42.343272 47.928708 50.524625 52.473812 54.410945 56.136194 56.370768 57.652357
## s19 s19 s19 s19 s19 s19 s20 s20
## 59.281218 60.424966 60.559783 61.336716 62.024606 62.347957 21.349592 26.631476
## s20 s20 s20 s20 s20 s20 s20 s20
## 33.484023 39.561736 43.679219 49.195029 50.364684 52.484288 54.856534 55.376390
## s20 s20 s20 s20 s20 s20 s20 s20
## 56.597456 57.310674 58.391076 58.813287 59.103748 59.773994 59.919531 60.232843
## s21 s21 s21 s21 s21 s21 s21 s21
## 18.222565 26.593725 30.311633 38.263060 42.689912 48.121193 50.041722 54.919947
## s21 s21 s21 s21 s21 s21 s21 s21
## 56.555521 58.830325 59.646080 60.753499 62.495963 63.355045 64.475317 65.091371
## s21 s21 s22 s22 s22 s22 s22 s22
## 65.448799 66.647591 13.136849 16.510608 21.662986 25.063604 25.821515 26.753925
## s22 s22 s22 s22 s22 s22 s22 s22
## 27.675036 28.230245 28.539260 28.846696 29.021058 29.135306 29.222536 29.318696
## s22 s22 s22 s22 s23 s23 s23 s23
## 29.334053 29.374324 29.381587 29.402903 12.043871 23.276636 27.901421 34.052665
## s23 s23 s23 s23 s23 s23 s23 s23
## 40.139780 41.462435 43.095881 46.151316 46.810884 47.906700 48.683344 49.073171
## s23 s23 s23 s23 s23 s23 s24 s24
## 49.535800 50.087909 50.250353 50.425371 50.510004 50.647682 7.174801 14.943108
## s24 s24 s24 s24 s24 s24 s24 s24
## 20.918849 24.537000 28.242894 30.530180 32.692814 34.574773 35.857968 37.260359
## s24 s24 s24 s24 s24 s24 s24 s24
## 38.864693 39.644737 40.919965 41.321688 41.862133 42.414352 42.584903 42.977873
## s25 s25 s25 s25 s25 s25 s25 s25
## 9.904840 12.586737 21.005195 25.589630 28.151033 30.093934 31.233860 33.219618
## s25 s25 s25 s25 s25 s25 s25 s25
## 33.474705 34.134605 34.996635 35.333897 35.734164 35.786666 36.055090 36.123403
## s25 s25 s26 s26 s26 s26 s26 s26
## 36.262410 36.281140 22.483463 25.809816 30.597481 35.219280 38.843102 40.958065
## s26 s26 s26 s26 s26 s26 s26 s26
## 43.175591 44.443450 45.327907 46.567169 47.667561 48.617712 49.125075 49.937844
## s26 s26 s26 s26 s27 s27 s27 s27
## 50.228149 50.766822 51.115442 51.157891 12.537166 17.414906 19.573962 25.407519
## s27 s27 s27 s27 s27 s27 s27 s27
## 30.789851 34.352133 36.718952 40.524039 41.045273 42.215137 43.999790 45.197371
## s27 s27 s27 s27 s27 s27 s28 s28
## 45.684319 46.470084 47.134976 47.475509 47.668318 48.269539 6.831689 15.524686
## s28 s28 s28 s28 s28 s28 s28 s28
## 20.840177 25.624519 29.089406 31.773320 32.590789 35.097402 37.039718 38.645741
## s28 s28 s28 s28 s28 s28 s28 s28
## 39.704369 40.526331 41.748233 42.451386 42.845762 43.295338 43.713601 44.161018
## s29 s29 s29 s29 s29 s29 s29 s29
## 11.741826 22.112372 27.251194 31.193346 36.222389 40.020849 42.991326 44.918332
## s29 s29 s29 s29 s29 s29 s29 s29
## 46.984542 48.428521 50.441603 51.982397 53.181716 54.299383 54.558408 55.555162
## s29 s29 s30 s30 s30 s30 s30 s30
## 56.111296 56.856414 1.173582 12.125508 19.328826 23.515080 27.700921 31.947992
## s30 s30 s30 s30 s30 s30 s30 s30
## 37.843413 41.309470 46.353807 51.938355 53.701703 55.054232 57.701865 60.842418
## s30 s30 s30 s30 s31 s31 s31 s31
## 62.838080 64.318421 66.907113 69.235637 22.717005 26.293267 29.794210 35.877539
## s31 s31 s31 s31 s31 s31 s31 s31
## 42.460337 46.654844 50.166251 50.562827 53.813046 57.073312 58.945029 61.162556
## s31 s31 s31 s31 s31 s31 s32 s32
## 62.856412 63.893210 65.993847 66.851432 67.215688 68.706764 14.789145 24.118555
## s32 s32 s32 s32 s32 s32 s32 s32
## 27.496372 36.223398 38.240626 42.482740 45.830403 47.704489 49.466084 51.326714
## s32 s32 s32 s32 s32 s32 s32 s32
## 53.325856 54.411924 54.548896 55.327316 56.255863 56.391160 56.860129 57.261582
## s33 s33 s33 s33 s33 s33 s33 s33
## 10.816721 14.359436 15.544882 19.017192 20.230871 21.325467 22.020325 22.469546
## s33 s33 s33 s33 s33 s33 s33 s33
## 23.229000 23.679772 23.845958 24.293678 24.404544 24.564913 24.645593 24.710858
## s33 s33 s34 s34 s34 s34 s34 s34
## 24.815287 24.890661 19.834765 25.224689 32.427049 37.756608 38.820898 42.527640
## s34 s34 s34 s34 s34 s34 s34 s34
## 47.667544 48.596606 51.829925 52.852833 54.326407 55.105977 56.615806 57.621828
## s34 s34 s34 s34 s35 s35 s35 s35
## 58.374444 58.876623 59.085407 60.038288 18.055419 31.795188 35.950055 45.492644
## s35 s35 s35 s35 s35 s35 s35 s35
## 49.339345 56.183827 61.159479 62.394526 66.230666 67.166567 68.934499 70.643459
## s35 s35 s35 s35 s35 s35 s36 s36
## 71.262334 72.177638 73.134305 73.479922 73.914211 74.600793 6.039154 16.343428
## s36 s36 s36 s36 s36 s36 s36 s36
## 29.654905 30.460264 38.583451 42.222247 45.090236 45.900526 47.928744 49.709314
## s36 s36 s36 s36 s36 s36 s36 s36
## 50.993321 51.761370 52.026835 52.473103 53.118306 53.301143 53.607310 53.869863
## s37 s37 s37 s37 s37 s37 s37 s37
## 25.665850 37.709421 45.885303 49.464800 53.886869 56.979847 59.042832 59.864639
## s37 s37 s37 s37 s37 s37 s37 s37
## 61.239775 62.060880 63.398082 63.832533 64.226395 64.643041 65.076459 65.210587
## s37 s37 s38 s38 s38 s38 s38 s38
## 65.485365 65.608604 11.506561 22.957119 29.844621 37.664670 39.849262 45.687336
## s38 s38 s38 s38 s38 s38 s38 s38
## 47.911509 49.880012 52.952212 53.916669 56.326149 57.217465 59.093220 60.012917
## s38 s38 s38 s38 s39 s39 s39 s39
## 61.045309 61.211271 62.045397 62.332383 7.389218 20.479869 25.602341 32.104898
## s39 s39 s39 s39 s39 s39 s39 s39
## 37.918572 40.306728 44.319506 47.015525 47.711997 50.703404 51.272827 53.855663
## s39 s39 s39 s39 s39 s39 s40 s40
## 54.095478 55.861273 56.011765 57.130527 57.364833 58.144633 11.172811 23.881111
## s40 s40 s40 s40 s40 s40 s40 s40
## 29.875099 35.947116 36.872505 42.888013 45.547699 47.058317 48.321466 50.276576
## s40 s40 s40 s40 s40 s40 s40 s40
## 51.033545 51.604459 52.182166 52.575485 52.752757 52.875346 53.095921 53.262183
## attr(,"label")
## [1] "Fitted values"
year <- seq(from=0, to=1.5, by=0.01)
rasch_FIM <- 58.53985-48.13958*exp(-2.53036*year)
PRED <- data.frame(year, rasch_FIM)
head(DAT2)
ggplot(DAT2, aes(x = year.0, y = rasch_FIM)) +
geom_line(aes(group=subID), col="grey") +
scale_x_continuous(name = "Time from Admission (Years)") +
scale_y_continuous(name = "Independence (0-100)",limits=c(0,100)) +
#facet_wrap(~AIS_grade) +
theme_bw() +
theme(axis.text=element_text(size=14, colour="black"),
axis.title=element_text(size=14,face="bold")) +
theme(strip.text.x = element_text(size = 14))+ theme(legend.position="none")+
geom_line(data=PRED, aes(x=year, y=rasch_FIM), col="black", lwd=1.5)
DAT2$exp_pred <- fitted(neg_exp_rand_mod) # Save model predictions to data frame
first4<-DAT2[c(1:72),] # Second, we'll make a smaller dataset with the predictions for these 10:
ggplot(first4, aes(x = year.0, y = rasch_FIM)) +
geom_line(aes(group=subID)) + facet_wrap(~subID, ncol=2)+
geom_point(fill="grey", pch=21, size=2, stroke=1.25) +
scale_x_continuous(name = "Time from Admission (Years)") +
scale_y_continuous(name = "Independence (0-100)",limits=c(0,100))+
geom_line(aes(y=exp_pred), lwd=1, col="purple") +
theme_bw() + theme(axis.text=element_text(size=12, colour="black"),
axis.title=element_text(size=14,face="bold")) +
theme(strip.text.x = element_text(size = 12))+ theme(legend.position="none")
Note that I am specifying the optimizer that we are using. Specifically, I am using the bobyqa optimizer. bobyqa is short for “Bound Approximation BY Quadratic Approximation”. It is the optimizer that lme4 uses by default, but I am calling it explicitly here. You can select other optimizers using the lmerControl() function: (https://www.rdocumentation.org/packages/lme4/versions/1.1-27/topics/lmerControl).
control=lmerControl(optimizer="Nelder_Mead",
optCtrl=list(maxfun=5e5)))
A good exercise is to run all of the models in this chapter using the Nelder-Mead optimizer and the BOBYQA optimizer. Do these models differ in how they estimate the fixed-effects and standard errors of the fixed-effects? Do these models differ in how they estimate the variances and covariances of the random-effects?
You can also change the number of iterations that your maximum likelihood simulation goes through by changing the maxfun in the optCtrl argument. If you run into convergence warnings, it is a good idea to:
For a fairly accessible discussion of these issues, I would recommend a simulation study by McCoach and colleagues (2018; https://journals.sagepub.com/doi/abs/10.3102/1076998618776348?journalCode=jebb). By simulating various multi-level data structures and then fitting the same mixed-effects models in different software packages, McCoach and colleagues found that the BOBYQA optimizer tended to give more non-convergence warnings than other optimizers/programs even though the estimates from the models were (effectively) the same.
…