Multilevel Models Tutorial

Erin M. Buchanan

Goals

  • Understand what types of data are appropriate for multilevel models
  • Review the basic equations/terminology of regression versus multilevel models
  • Learn how to implement these models in R

Terminology Note

  • Multi-level model
  • Random effects model
  • Mixed model
  • Random coefficient model
  • Hierarchical model

Non-Hierarchical Data

  • Often thought of as “between-subjects” data
  • Each person/observation has one row of data, and each column represents one variable

Hierarchical Data

  • Hierarchical data contains data points “nested” within a grouping variable
    • A group of students’ exam scores (each person tested multiple times)
    • Prices of healthcare equipment over the last year (same equipment measured over time)
    • Happiness scores for individuals in multiple countries (each country contains multiple measurements of people)

Hierarchical Data

  • In theory, we could collapse the multiple measurements into one average score:
    • A group of students’ exam scores (each person tested multiple times)
      • Each student has one overall exam average
      • Each exam has an overall average
    • Prices of healthcare equipment over the last year (same equipment measured over time)
      • Average yearly price for each equipment piece
    • Happiness scores for individuals in multiple countries (each country contains multiple measurements of people)
      • Average happiness by country

Hierarchical Data

  • However, that’s lame:
    • Reduces power because multiple data points become one score
    • Ignores the measurement error of the multiple data points
    • May hide interesting relationships that exist when points are examined individually

Multilevel Models

  • Multilevel models (MLMs) allow you to retain power and analyze each data point while controlling for non-independence.
  • You can potentially uncover why heteroscedasticity occurs within the analysis
  • You can have missing data but still use other available data points
    • For example, if equipment prices are missing from May, you could model the data even without those numbers
  • Control for measurement error due to the study design or measurement tools

Regression Equation

  • A simple equation with one predictor:

\[Y_i = b_0 + b_1X_i + \epsilon_i\]

  • \(Y_i\): The dependent variable score for participant \(i\)
  • \(b_0\): The y-intercept, the average score for Y when X is zero
  • \(b_1\): The slope for the first predictor
    • For every one unit increase in X, Y changes by \(b_1\) units
  • \(X_i\): The score on the first predictor for participant \(i\)
  • \(\epsilon_i\): The residual or error for participant \(i\)
    • The actual \(Y_i\) minus the predicted \(\hat{Y}_i = b_0 + b_1X_i\)
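The equation maps directly onto lm() in R. A quick sketch with simulated data (so the exact estimates will vary around the true values):

```r
# simulate a predictor and an outcome with a known intercept and slope
set.seed(1234)
X <- rnorm(100, mean = 50, sd = 10)
Y <- 10 + 0.5 * X + rnorm(100, sd = 5) # b0 = 10, b1 = 0.5, plus error

simple.model <- lm(Y ~ X)
coef(simple.model)              # estimates of b0 and b1
head(residuals(simple.model))   # the epsilon_i values: actual Y minus predicted Y
```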

Modeling

  • MLM is still regression - and we can use linear, logistic, etc.
  • We generally use the same system as examining regression:
    • An overall model
    • Examine individual predictors
  • However, we need to add a new concept: random effects
    • Fixed coefficients: Intercepts/slopes are assumed to be the same across different contexts
    • Random coefficients: Intercepts/slopes are allowed to vary across different contexts

Random Coefficients

  • Random variable: a grouping or clustering variable
    • You set this variable up based on the structure of the data
    • Usually the participants or data points that are measured multiple times
  • A group of students’ exam scores (each person tested multiple times)
    • Students would be the random variable
  • Prices of healthcare equipment over the last year (same equipment measured over time)
    • Equipment is most likely, but could also be time depending on hypothesis
  • Happiness scores for individuals in multiple countries (each country contains multiple measurements of people)
    • Country

MLM Terminology

  • Level 1 versus Level 2, 3, …
  • Examples help: Predicting a language test score ~ gender_student + school_type
  • Dependent variable: language test score
  • Independent variables: gender of student and school type
  • Cluster variable: school number (multiple schools within school type)
  • Level 1: gender of student (each test score is matched to a gender)
  • Level 2: school type (multiple language scores matched to a school type)
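In nlme syntax, this example would look something like the call below (a sketch; the names language_score, gender_student, school_type, school_id, and school_data are hypothetical placeholders):

```r
library(nlme)

# fixed effects at Level 1 (gender_student) and Level 2 (school_type),
# with a random intercept for the cluster variable (school_id)
example.model <- lme(language_score ~ gender_student + school_type,
                     random = ~1 | school_id,
                     data = school_data) # hypothetical data frame
```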

MLM Regression Equation

  • Level 1

\[Y_{ij} = b_{0j} + b_{1j}X_{ij} + \epsilon_{ij}\]

  • \(Y_{ij}\): score on the dependent variable for participant \(i\) in Level 2 group \(j\)
  • \(b_{0j}\): Intercept for Level 2 group \(j\)
  • \(b_{1j}\): Slope for the Level 1 predictor in Level 2 group \(j\)
  • \(X_{ij}\): Level 1 predictor score for participant \(i\) in Level 2 group \(j\)
  • \(\epsilon_{ij}\): Error for participant \(i\) in Level 2 group \(j\)

MLM Regression Equation

  • Level 2

\[b_{0j} = \gamma_{00} + \gamma_{01}X_j + u_{0j}\]

  • \(X_j\): Level 2 predictor
  • \(\gamma_{00}\): Overall intercept: grand mean of the dependent variable
  • \(\gamma_{01}\): Effect of Level 2 predictor on level 1 intercept
  • \(u_{0j}\): Difference in Level 2 \(j\) from the overall intercept

\[b_{1j} = \gamma_{10} + \gamma_{11}X_j + u_{1j}\]

  • \(\gamma_{10}\): average slope of the Level 1 predictor
  • \(\gamma_{11}\): Effect of the Level 2 predictor on the level 1 slope
  • \(u_{1j}\): Difference in Level 2 \(j\) from the overall slope
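Substituting the Level 2 equations into the Level 1 equation gives the combined (mixed) model, which is what the software actually estimates:

\[Y_{ij} = \gamma_{00} + \gamma_{01}X_j + \gamma_{10}X_{ij} + \gamma_{11}X_jX_{ij} + u_{0j} + u_{1j}X_{ij} + \epsilon_{ij}\]

The \(\gamma\) terms are the fixed effects, while the \(u\) and \(\epsilon\) terms are the random effects.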

Random Coefficients

  • Random intercept: allows each of the random cluster variables to have a different intercept
    • Remember that the intercept is the average score for Y if X(s) are zero
    • Therefore, effectively this is the average of each cluster separately
    • The values shown in the output are the deviation from the overall non-random intercept

Random Coefficients

knitr::include_graphics("images/12_img_1.png")

Random Coefficients

  • Random slope: allows each of the random cluster variables to have a different slope for the target coefficient
    • You can just do one random slope (for just one X variable) or many of them
    • This output creates a different slope for each of the clusters
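One way to see what random coefficients capture is to fit a separate regression per cluster on simulated data (a sketch with made-up data; in practice the mixed model estimates these jointly with pooling, rather than separately):

```r
# simulate 5 clusters whose true intercepts and slopes differ
set.seed(5678)
sim <- do.call(rbind, lapply(1:5, function(cluster) {
  X <- rnorm(30)
  data.frame(cluster = cluster,
             X = X,
             # cluster-specific intercept and slope plus residual noise
             Y = rnorm(1, 5, 2) + rnorm(1, 1, 0.5) * X + rnorm(30))
}))

# per-cluster intercepts and slopes, analogous to what random
# intercepts + random slopes estimate (but without pooling)
by(sim, sim$cluster, function(d) coef(lm(Y ~ X, data = d)))
```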

Random Coefficients

knitr::include_graphics("images/12_img_2.png")

Random Coefficients

knitr::include_graphics("images/12_img_3.png")

What Should We Use?

  • Models are generally built from the simplest to the most complex
    • Null model or intercept only model
    • Random intercept model (adds to the above)
    • Fixed effects model (adds to the above)
    • Random slopes (adds to the above, if you want)
  • But sometimes people like to do it the other way
  • Compare with AIC

Packages

library(rio) # importing data
library(nlme) # other people love lme4
library(dplyr) # best data manipulation package
library(performance) # easy stats is great
library(parameters) # easy stats is great 
library(report) # easy stats is great 
# also recommend library(see)
library(MuMIn) # effect sizes 

Example 1: Priming

  • A good number of psychology projects use repeated measures data
  • This study is a large scale semantic priming study: https://osf.io/preprints/osf/q4fjy
  • The data includes the priming trials from Serbian
  • Dependent variable: Response latency
  • Independent variable: Word pair condition (Level 1)
  • Clustering variables: Participant, Item Trial

Example 1: Priming Data

DF <- import("data/sr_prime_trials.csv") %>% 
  # only variables we need
  select(observation, target_duration, word_combo, 
         keep_target, keep_participant, type) %>% 
  # remove trials and people who don't meet inclusion rules
  filter(keep_target == "keep") %>% 
  filter(keep_participant == "keep") %>% 
  na.omit()

head(DF)
     observation target_duration   word_combo keep_target keep_participant
1 5c86ebfb2-e609         575.544 авенијаручак        keep             keep
2 be68c9e68-9040         662.900 авенијаручак        keep             keep
3 c70a8bb1f-d793         488.900 авенијаручак        keep             keep
4 77c9da84f-a097         400.000 авенијаручак        keep             keep
5 c4244b8de-0681         609.000 авенијаручак        keep             keep
6 e9b8b9c54-a976         576.962 авенијаручак        keep             keep
       type
1 unrelated
2 unrelated
3 unrelated
4 unrelated
5 unrelated
6 unrelated

Example 1: Priming Data

# number of participants
length(unique(DF$observation))
[1] 681
# number of items 
length(unique(DF$word_combo))
[1] 2000
# average number of items per participant
DF %>% 
  group_by(observation) %>% 
  summarize(n_trials = n()) %>% 
  ungroup() %>% 
  summarize(mean_trial = mean(n_trials),
            sd_trial = sd(n_trials))
# A tibble: 1 × 2
  mean_trial sd_trial
       <dbl>    <dbl>
1       140.     21.5

Example 1: Intercept Only Model

  • gls is akin to lm using generalized least squares
intercept.model <- gls(target_duration ~ 1, 
                       data = DF, 
                       method = "ML", 
                       na.action = "na.omit")

summary(intercept.model)
Generalized least squares fit by maximum likelihood
  Model: target_duration ~ 1 
  Data: DF 
      AIC     BIC    logLik
  1362406 1362425 -681201.1

Coefficients:
               Value Std.Error t-value p-value
(Intercept) 725.0498 0.9731638 745.044       0

Standardized residuals:
       Min         Q1        Med         Q3        Max 
-1.8659624 -0.6086932 -0.2800019  0.2603637  7.4749990 

Residual standard error: 300.8902 
Degrees of freedom: 95598 total; 95597 residual

Example 1: Random Intercept Model

  • nlme::lme versus lme4::lmer
    • In lme4, random effects are added directly to the formula:
    • target_duration ~ (1|observation) + (1|word_combo)
random.intercept.model <- lme(target_duration ~ 1, 
                              data = DF, 
                              method = "ML", 
                              na.action = "na.omit",
                              # this is the random intercept part 
                              random = list(~1|observation,
                                            ~1|word_combo))

summary(random.intercept.model)
Linear mixed-effects model fit by maximum likelihood
  Data: DF 
      AIC     BIC    logLik
  1343359 1343397 -671675.5

Random effects:
 Formula: ~1 | observation
        (Intercept)
StdDev:    135.5622

 Formula: ~1 | word_combo %in% observation
        (Intercept) Residual
StdDev:    141.7564 228.5829

Fixed effects:  target_duration ~ 1 
               Value Std.Error    DF  t-value p-value
(Intercept) 725.6796  5.273002 94343 137.6217       0

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max 
-3.3912147 -0.4694673 -0.1845673  0.2214105  7.1559565 

Number of Observations: 95598
Number of Groups: 
                observation word_combo %in% observation 
                        681                       95024 
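For comparison, the lme4 version of this model keeps the random intercepts in the formula itself (a sketch, assuming lme4 is installed; note that lme4 would treat observation and word_combo as crossed random effects, while the nlme list() syntax above nests word_combo within observation, as the output shows):

```r
library(lme4)

# random intercepts for participants and items as crossed effects;
# REML = FALSE matches the maximum likelihood estimation used above
random.intercept.lmer <- lmer(target_duration ~ 1 +
                                (1 | observation) + (1 | word_combo),
                              data = DF, REML = FALSE)
```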

Example 1: Fixed Effects Model

fixed.model <- lme(target_duration ~ 1 + type,
                   data = DF, 
                   method = "ML", 
                   na.action = "na.omit",
                   random = list(~1|observation,
                                 ~1|word_combo))

summary(fixed.model)
Linear mixed-effects model fit by maximum likelihood
  Data: DF 
      AIC     BIC    logLik
  1342544 1342591 -671266.9

Random effects:
 Formula: ~1 | observation
        (Intercept)
StdDev:    135.6001

 Formula: ~1 | word_combo %in% observation
        (Intercept) Residual
StdDev:     139.747 228.4602

Fixed effects:  target_duration ~ 1 + type 
                 Value Std.Error    DF   t-value p-value
(Intercept)   700.9499  5.343982 94342 131.16623       0
typeunrelated  49.7186  1.735545 94342  28.64723       0
 Correlation: 
              (Intr)
typeunrelated -0.162

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max 
-3.3414581 -0.4729406 -0.1829942  0.2230444  7.2947095 

Number of Observations: 95598
Number of Groups: 
                observation word_combo %in% observation 
                        681                       95024 

Example 1: Model Comparison

compare_performance(intercept.model, random.intercept.model,
                    fixed.model)
# Comparison of Model Performance Indices

Name                   | Model |   AIC (weights) |   BIC (weights) |    RMSE |   Sigma |  AICc (weights) |        R2
--------------------------------------------------------------------------------------------------------------------
intercept.model        |   gls | 1.4e+06 (<.001) | 1.4e+06 (<.001) | 300.890 | 300.890 |         (>.999) | 4.441e-16
random.intercept.model |   lme | 1.3e+06 (<.001) | 1.3e+06 (<.001) | 193.686 | 228.583 | 1.3e+06 (>.999) |          
fixed.model            |   lme | 1.3e+06 (>.999) | 1.3e+06 (>.999) | 194.311 | 228.460 | 1.3e+06 (>.999) |          
AIC(intercept.model)
[1] 1362406
AIC(random.intercept.model) # lower 
[1] 1343359
AIC(fixed.model) # lower
[1] 1342544

Example 1: Model Assumption Checks

# check_model(fixed.model)
# check_outliers(fixed.model)

# if data is large, check_model is slow 
residuals <- scale(residuals(fixed.model))
fitted <- scale(fitted.values(fixed.model))

hist(residuals)

{ qqnorm(residuals); abline(0,1) } 

{ plot(residuals, fitted); abline(h = 0); abline(v = 0) }

Example 1: Effect Size

  • R2m (marginal): variance explained by the fixed effects alone; R2c (conditional): variance explained by the fixed and random effects together
r.squaredGLMM(fixed.model)
             R2m       R2c
[1,] 0.006811159 0.4247235

Example 1: Examine Results

model_parameters(fixed.model, summary = TRUE)
# Fixed Effects

Parameter        | Coefficient |   SE |           95% CI | t(94342) |      p
----------------------------------------------------------------------------
(Intercept)      |      700.95 | 5.34 | [690.48, 711.42] |   131.17 | < .001
type [unrelated] |       49.72 | 1.74 | [ 46.32,  53.12] |    28.65 | < .001

# Random Effects

Parameter                   | Coefficient
-----------------------------------------
SD (Intercept: observation) |      135.60
SD (Intercept: word_combo)  |      139.75
SD (Residual)               |      228.46

Model: target_duration ~ 1 + type (95598 Observations)
Residual standard deviation: 228.460 (df = 94342)
# plot(model_parameters(fixed.model, summary = TRUE))
standardize_parameters(fixed.model)
# Standardization method: refit

Parameter        | Std. Coef. |         95% CI
----------------------------------------------
(Intercept)      |      -0.08 | [-0.11, -0.05]
type [unrelated] |       0.17 | [ 0.15,  0.18]
report(fixed.model)
We fitted a linear mixed model (estimated using ML and nlminb optimizer) to
predict target_duration with type (formula: target_duration ~ 1 + type). The
model included observation as random effects (formula: list(~1 | observation,
~1 | word_combo)). The model's intercept, corresponding to type = related, is
at 700.95 (95% CI [690.48, 711.42], t(94342) = 131.17, p < .001). Within this
model:

  - The effect of type [unrelated] is statistically significant and positive
(beta = 49.72, 95% CI [46.32, 53.12], t(94342) = 28.65, p < .001; Std. beta =
0.17, 95% CI [0.15, 0.18])

Standardized parameters were obtained by fitting the model on a standardized
version of the dataset. 95% Confidence Intervals (CIs) and p-values were
computed using a Wald t-distribution approximation.

Example 2: Daily Diary Study

  • Participants were given daily surveys to measure their negative affect and stress across time
  • Dependent variable: Stress scores
  • Independent variable:
    • Level 1: Centered PANAS Negative Score (within)
    • Level 2: Mean PANAS Score across time (between)
  • Clustering variable(s): Participant, time

Example 2: Daily Diary Data

DF <- import("data/panas_data.csv") %>% 
  na.omit()

head(DF)
  progress_total panas_negative stress_total                           partno
1             23             15            1 dc990a2ce52ba0b2d2d766e43c5071d6
2             23             15            1 dc990a2ce52ba0b2d2d766e43c5071d6
3             25             14            1 dc990a2ce52ba0b2d2d766e43c5071d6
4             26             14            1 dc990a2ce52ba0b2d2d766e43c5071d6
5             25             16            1 dc990a2ce52ba0b2d2d766e43c5071d6
6             30             15            1 dc990a2ce52ba0b2d2d766e43c5071d6
  realtime
1     0.00
2     1.00
3     1.99
4     2.99
5     3.98
6     4.99

Example 2: Daily Diary Data

# number of participants
length(unique(DF$partno))
[1] 113
# average number of time points per participant
DF %>% 
  group_by(partno) %>% 
  summarize(n_times = n()) %>% 
  ungroup() %>% 
  summarize(mean_times = mean(n_times),
            sd_times = sd(n_times),
            min_times = min(n_times),
            max_times = max(n_times))
# A tibble: 1 × 4
  mean_times sd_times min_times max_times
       <dbl>    <dbl>     <int>     <int>
1       12.6     2.64         1        15

Example 2: Intercept Only Model

intercept.model <- gls(stress_total ~ 1, 
                       data = DF, 
                       method = "ML", 
                       na.action = "na.omit")

summary(intercept.model)
Generalized least squares fit by maximum likelihood
  Model: stress_total ~ 1 
  Data: DF 
       AIC      BIC    logLik
  4876.129 4886.646 -2436.065

Coefficients:
               Value  Std.Error  t-value p-value
(Intercept) 1.528169 0.03571252 42.79085       0

Standardized residuals:
       Min         Q1        Med         Q3        Max 
-1.1359513 -0.3926099 -0.3926099 -0.3926099  7.7841455 

Residual standard error: 1.345277 
Degrees of freedom: 1420 total; 1419 residual

Example 2: Random Intercept Model

random.intercept.model <- lme(stress_total ~ 1, 
                              data = DF, 
                              method = "ML", 
                              na.action = "na.omit",
                              # this is the random intercept part 
                              random = list(~1|partno))

summary(random.intercept.model)
Linear mixed-effects model fit by maximum likelihood
  Data: DF 
       AIC      BIC    logLik
  4456.026 4471.802 -2225.013

Random effects:
 Formula: ~1 | partno
        (Intercept) Residual
StdDev:   0.8244337  1.06616

Fixed effects:  stress_total ~ 1 
              Value  Std.Error   DF  t-value p-value
(Intercept) 1.55362 0.08321221 1307 18.67058       0

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-4.30454994 -0.29479819 -0.11525706 -0.05541002  8.60744658 

Number of Observations: 1420
Number of Groups: 113 

Example 2: Fixed Effects Creation

negative.mean <- aggregate(DF$panas_negative, 
                        by = list(DF$partno), 
                        mean, na.rm = TRUE)
names(negative.mean) <- c("partno", "negative.mean")
DF <- merge(DF, negative.mean, by = "partno")
DF$negative.cnt <- DF$panas_negative - DF$negative.mean
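The same centering can be done in one dplyr pipeline, matching the style used elsewhere in these slides (an equivalent sketch, not an additional required step):

```r
DF <- DF %>% 
  group_by(partno) %>% 
  # between-person mean and within-person (centered) score
  mutate(negative.mean = mean(panas_negative, na.rm = TRUE),
         negative.cnt = panas_negative - negative.mean) %>% 
  ungroup()
```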

Example 2: Fixed Effects Model

fixed.model <- lme(stress_total ~ 1 + negative.mean + negative.cnt,
                   data = DF, 
                   method = "ML", 
                   na.action = "na.omit",
                   random = list(~1|partno))

summary(fixed.model)
Linear mixed-effects model fit by maximum likelihood
  Data: DF 
       AIC      BIC    logLik
  4316.113 4342.405 -2153.056

Random effects:
 Formula: ~1 | partno
        (Intercept) Residual
StdDev:   0.7572297 1.016058

Fixed effects:  stress_total ~ 1 + negative.mean + negative.cnt 
                  Value  Std.Error   DF   t-value p-value
(Intercept)   0.5313282 0.24342615 1306  2.182708  0.0292
negative.mean 0.0556071 0.01257117  111  4.423381  0.0000
negative.cnt  0.0657026 0.00572279 1306 11.480883  0.0000
 Correlation: 
              (Intr) ngtv.m
negative.mean -0.949       
negative.cnt   0.000  0.000

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max 
-4.0022728 -0.3932228 -0.0945899  0.1009836  8.5236045 

Number of Observations: 1420
Number of Groups: 113 

Example 2: Random Slope Model

random.slope.model <- lme(stress_total ~ 1 + negative.mean + negative.cnt,
                   data = DF, 
                   method = "ML", 
                   na.action = "na.omit",
                   random = list(~realtime|partno))

summary(random.slope.model)
Linear mixed-effects model fit by maximum likelihood
  Data: DF 
       AIC      BIC    logLik
  4174.756 4211.565 -2080.378

Random effects:
 Formula: ~realtime | partno
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev    Corr  
(Intercept) 1.3277461 (Intr)
realtime    0.1075518 -0.893
Residual    0.9187873       

Fixed effects:  stress_total ~ 1 + negative.mean + negative.cnt 
                  Value  Std.Error   DF   t-value p-value
(Intercept)   0.6321300 0.21288561 1306  2.969341   3e-03
negative.mean 0.0377227 0.01102278  111  3.422248   9e-04
negative.cnt  0.0603304 0.00550876 1306 10.951731   0e+00
 Correlation: 
              (Intr) ngtv.m
negative.mean -0.949       
negative.cnt   0.028 -0.016

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-3.02732595 -0.38036935 -0.05720288  0.12664612  8.92243129 

Number of Observations: 1420
Number of Groups: 113 

Example 2: Model Comparison

compare_performance(intercept.model, random.intercept.model,
                    fixed.model, random.slope.model)
# Comparison of Model Performance Indices

Name                   | Model |  AIC (weights) |  BIC (weights) |  RMSE | Sigma | AICc (weights) | R2 (cond.) | R2 (marg.) |   ICC |        R2
-----------------------------------------------------------------------------------------------------------------------------------------------
intercept.model        |   gls | 4876.1 (<.001) | 4886.6 (<.001) | 1.345 | 1.345 |        (>.999) |            |            |       | 3.331e-15
random.intercept.model |   lme | 4456.0 (<.001) | 4471.8 (<.001) | 1.029 | 1.066 | 4456.0 (>.999) |      0.374 |      0.000 | 0.374 |          
fixed.model            |   lme | 4316.1 (<.001) | 4342.4 (<.001) | 0.981 | 1.016 | 4316.2 (>.999) |      0.432 |      0.116 | 0.357 |          
random.slope.model     |   lme | 4174.8 (>.999) | 4211.6 (>.999) | 0.863 | 0.919 | 4174.8 (>.999) |      0.692 |      0.049 | 0.676 |          
AIC(intercept.model)
[1] 4876.129
AIC(random.intercept.model) # lower 
[1] 4456.026
AIC(fixed.model) # lower
[1] 4316.113
AIC(random.slope.model) # lower
[1] 4174.756

Example 2: Model Assumption Checks

check_model(random.slope.model)
check_outliers(random.slope.model)
OK: No outliers detected.
- Based on the following method and threshold:  ().
- For variable: (Whole model)

Example 2: Effect Size

r.squaredGLMM(random.slope.model)
           R2m       R2c
[1,] 0.0751046 0.5254546

Example 2: Examine Results

model_parameters(random.slope.model, summary = TRUE)
# Fixed Effects

Parameter     | Coefficient |       SE |       95% CI |     t |   df |      p
-----------------------------------------------------------------------------
(Intercept)   |        0.63 |     0.21 | [0.21, 1.05] |  2.97 | 1306 | 0.003 
negative mean |        0.04 |     0.01 | [0.02, 0.06] |  3.42 |  111 | < .001
negative cnt  |        0.06 | 5.51e-03 | [0.05, 0.07] | 10.95 | 1306 | < .001

# Random Effects

Parameter                        | Coefficient
----------------------------------------------
SD (Intercept: partno)           |        1.33
SD (realtime: partno)            |        0.11
Cor (Intercept~realtime: partno) |       -0.89
SD (Residual)                    |        0.92

Model: stress_total ~ 1 + negative.mean + negative.cnt (1420 Observations)
Residual standard deviation: 0.919 (df = 1306)
Conditional R2: 0.692; Marginal R2: 0.049
# plot(model_parameters(fixed.model, summary = TRUE))

report(random.slope.model)
We fitted a linear mixed model (estimated using ML and nlminb optimizer) to
predict stress_total with negative.mean and negative.cnt (formula: stress_total
~ 1 + negative.mean + negative.cnt). The model included realtime as random
effects (formula: list(~realtime | partno)). The model's total explanatory
power is substantial (conditional R2 = 0.69) and the part related to the fixed
effects alone (marginal R2) is of 0.05. The model's intercept, corresponding to
negative.mean = 0 and negative.cnt = 0, is at 0.63 (95% CI [0.21, 1.05],
t(1306) = 2.97, p = 0.003). Within this model:

  - The effect of negative mean is statistically significant and positive (beta =
0.04, 95% CI [0.02, 0.06], t(111) = 3.42, p < .001; Std. beta = 0.17, 95% CI
[0.07, 0.27])
  - The effect of negative cnt is statistically significant and positive (beta =
0.06, 95% CI [0.05, 0.07], t(1306) = 10.95, p < .001; Std. beta = 0.21, 95% CI
[0.17, 0.25])

Standardized parameters were obtained by fitting the model on a standardized
version of the dataset. 95% Confidence Intervals (CIs) and p-values were
computed using a Wald t-distribution approximation.

Other Resources

  • Free book on MLM: https://www.learn-mlms.com/
  • Tutorial on simulating nested data: https://debruine.github.io/faux/articles/sim_mixed.html#adding-random-effects
  • Gelman paper: http://www.stat.columbia.edu/~gelman/research/published/multi2.pdf

Summary

  • Multilevel models are flexible tools for structured/hierarchical data
  • The equations are regressions within regressions: Level 2 equations predict the Level 1 intercepts and slopes
  • R makes these easy to run, easystats is still awesome
  • Thank you for your time on the weekend!