
LMM Quick Reference

Fast lookup for syntax, diagnostics, and troubleshooting. For step-by-step learning, see the Walkthrough. For conceptual background, see the Overview.

Jump to: Syntax · ML vs. REML · Extract Output · Model Comparison · Diagnostics · Parameters · Errors & Fixes · Troubleshooting · Pitfalls · Formulas · Extensions · Resources


lme4 Syntax

Random Intercept Only

mod <- lmer(y ~ time + (1 | id), data = data_long)

Random Intercept and Slope (Correlated)

mod <- lmer(y ~ time + (1 + time | id), data = data_long)

Random Intercept and Slope (Uncorrelated)

mod <- lmer(y ~ time + (1 | id) + (0 + time | id), data = data_long)

With Between-Person Predictor

mod <- lmer(y ~ time + treatment + (1 + time | id), data = data_long)

Cross-Level Interaction

mod <- lmer(y ~ time * treatment + (1 + time | id), data = data_long)

Quadratic Time

mod <- lmer(y ~ time + I(time^2) + (1 + time | id), data = data_long)

Random Slope Only (No Random Intercept)

mod <- lmer(y ~ time + (0 + time | id), data = data_long)

Nested Random Effects (Three-Level Data)

Use when you have clustering at multiple levels (e.g., observations within students within schools):

# Three-level: observations nested in students nested in schools
mod <- lmer(y ~ time + (1 + time | school/id), data = data_long)

# Equivalent explicit syntax
mod <- lmer(y ~ time + (1 + time | school) + (1 + time | school:id), data = data_long)

Note: For standard longitudinal data (observations within persons only), use the simpler (1 + time | id) syntax shown above.


Estimation: ML vs. REML

| Method | Use When | Code |
|---|---|---|
| REML (default) | Final model; reporting variances | REML = TRUE |
| ML | Comparing models with different fixed effects | REML = FALSE |

REML (Default)

mod <- lmer(y ~ time + (1 + time | id), data = data_long)
# or explicitly:
mod <- lmer(y ~ time + (1 + time | id), data = data_long, REML = TRUE)

ML (For Model Comparison)

mod <- lmer(y ~ time + (1 + time | id), data = data_long, REML = FALSE)

Workflow

  1. Compare models with REML = FALSE
  2. Select best model via anova() or AIC/BIC
  3. Refit final model with REML = TRUE for reporting
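
In code, the workflow looks like this (model names are illustrative; the formulas follow the examples above):

```r
library(lme4)

# Step 1: fit candidate models with ML
mod_a <- lmer(y ~ time + (1 + time | id), data = data_long, REML = FALSE)
mod_b <- lmer(y ~ time + treatment + (1 + time | id), data = data_long, REML = FALSE)

# Step 2: compare via LRT / AIC / BIC
anova(mod_a, mod_b)

# Step 3: refit the chosen model with REML for reporting
mod_final <- update(mod_b, REML = TRUE)
summary(mod_final)
```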

Extract Output

Summary

summary(mod)

Fixed Effects

fixef(mod)

Fixed Effects with Confidence Intervals

confint(mod, parm = "beta_", method = "Wald")
# or profile likelihood (slower, more accurate):
confint(mod, parm = "beta_", method = "profile")

Variance Components

VarCorr(mod)

Random Effects (BLUPs)

ranef(mod)$id

Fitted Values

fitted(mod)

Residuals

resid(mod)

R² (Marginal and Conditional)

library(performance)
r2(mod)
  • Marginal R²: Variance explained by fixed effects
  • Conditional R²: Variance explained by fixed + random effects

Intraclass Correlation (ICC)

library(performance)
icc(mod)

Coefficients Table with p-values

library(lmerTest)
mod <- lmer(y ~ time + (1 + time | id), data = data_long)
summary(mod)  # Now includes p-values

Model Comparison

Likelihood Ratio Test (Nested Models)

# Must use ML for comparing different fixed effects
mod1 <- lmer(y ~ time + (1 + time | id), data = data_long, REML = FALSE)
mod2 <- lmer(y ~ time + treatment + (1 + time | id), data = data_long, REML = FALSE)

anova(mod1, mod2)

Comparing Random Effects Structures

# REML comparison is valid when fixed effects are identical.
# Note: anova() refits with ML by default; pass refit = FALSE to compare the REML fits.
mod_ri <- lmer(y ~ time + (1 | id), data = data_long)
mod_rs <- lmer(y ~ time + (1 + time | id), data = data_long)

anova(mod_ri, mod_rs, refit = FALSE)

Information Criteria

AIC(mod1, mod2)  # Lower = better
BIC(mod1, mod2)  # Lower = better; more conservative

Boundary Test Note

When testing whether a variance = 0 (e.g., random slope variance), the LRT is a boundary test. The reported p-value is conservative—the true p-value is approximately half.
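For a single variance on the boundary, the correction can be applied by hand; mod_ri and mod_rs are the random-intercept and random-slope models from the comparison above. (Adding a slope variance together with its correlation strictly calls for a mixture of χ² distributions, so halving is an approximation.)

```r
library(lme4)

# Identical fixed effects, so keep the REML fits for the LRT
lrt <- anova(mod_ri, mod_rs, refit = FALSE)
p_reported <- lrt[["Pr(>Chisq)"]][2]
p_adjusted <- p_reported / 2  # approximate boundary correction
```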


Diagnostics

Residuals vs. Fitted

plot(mod)

Look for: Random scatter around 0. Problematic: Funnel shape, curves.

Q-Q Plot for Residuals

qqnorm(resid(mod))
qqline(resid(mod))

Look for: Points on line. Problematic: Heavy tails, skewness.

Q-Q Plot for Random Effects

# Random intercepts
qqnorm(ranef(mod)$id[, "(Intercept)"])
qqline(ranef(mod)$id[, "(Intercept)"])

# Random slopes
qqnorm(ranef(mod)$id[, "time"])
qqline(ranef(mod)$id[, "time"])

Check for Singular Fit

isSingular(mod)  # TRUE = problem

Inspect Variance Components

VarCorr(mod)
# Check for variances at or near zero

Model Performance Summary

library(performance)
check_model(mod)  # Comprehensive diagnostic plots

Parameters

| Parameter | Symbol | lme4 Location |
|---|---|---|
| Average intercept | γ₀₀ | Fixed effects: (Intercept) |
| Average slope | γ₁₀ | Fixed effects: time |
| Intercept variance | τ₀₀ | Random effects: id (Intercept) |
| Slope variance | τ₁₁ | Random effects: id time |
| Intercept-slope correlation | ρ | Random effects: Corr |
| Residual variance | σ² | Residual |

Reading lme4 Output

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 id       (Intercept) 98.54    9.93          <- τ₀₀
          time         0.89    0.94     -0.18 <- τ₁₁, ρ
 Residual             24.89    4.99          <- σ²

Fixed effects:
            Estimate Std. Error ...
(Intercept)  49.923   0.732     <- γ₀₀
time          2.021   0.085     <- γ₁₀

Common Errors & Fixes

| Issue | Symptom | Fix |
|---|---|---|
| Singular fit | boundary (singular) fit warning | Simplify random effects; check VarCorr() |
| Time as factor | Several dummy coefficients instead of one slope | time <- as.numeric(time) |
| REML comparison | Invalid LRT for fixed effects | Use REML = FALSE |
| Convergence failure | "Model failed to converge" | Simplify model; scale predictors; increase iterations |
| Missing p-values | No p-values in summary | Load lmerTest before fitting |
| Data not in long format | Errors about missing variables | Reshape with pivot_longer() |

Troubleshooting

Singular Fit

Warning: boundary (singular) fit: see help('isSingular')

What it means: A variance component is estimated at or very near zero.

Diagnose:

VarCorr(mod)  # Check which variance is near zero

Solutions:

  1. Remove the problematic random effect
  2. Use uncorrelated random effects: (1 | id) + (0 + time | id)
  3. Accept if theoretically justified (variance genuinely near zero)
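
A typical simplification sequence, sketched with the running example:

```r
library(lme4)

# Step 1: drop the intercept-slope correlation
mod_uncorr <- lmer(y ~ time + (1 | id) + (0 + time | id), data = data_long)
isSingular(mod_uncorr)

# Step 2: if still singular, drop the near-zero random effect entirely
mod_ri <- lmer(y ~ time + (1 | id), data = data_long)
isSingular(mod_ri)
```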

Convergence Failure

Warning: "Model failed to converge" or "convergence code: -1"

Solutions:

# Increase iterations
mod <- lmer(y ~ time + (1 + time | id), data = data_long,
            control = lmerControl(optCtrl = list(maxfun = 20000)))

# Try different optimizer
mod <- lmer(y ~ time + (1 + time | id), data = data_long,
            control = lmerControl(optimizer = "bobyqa"))

# Scale predictors (scale() returns a one-column matrix; coerce to a plain vector)
data_long$time_scaled <- as.numeric(scale(data_long$time))

BLUPs in Secondary Analyses

Problem: Using ranef() output as data in follow-up analyses.

Why it's wrong: BLUPs have uncertainty; treating them as fixed underestimates SEs.

Solution: Include predictors in the original model:

# Wrong
re <- ranef(mod)$id
cor.test(re$`(Intercept)`, external_variable)

# Right
mod <- lmer(y ~ time + external_variable + (1 + time | id), data = data_long)

No p-values in Output

Solution: Load lmerTest before fitting:

library(lmerTest)
mod <- lmer(y ~ time + (1 + time | id), data = data_long)
summary(mod)  # Now includes p-values

Negative Variance Estimate

Unlike lavaan, lme4 constrains variances to be non-negative. If you see a variance at exactly zero, it's hitting the boundary—see Singular Fit above.


Interpretation Pitfalls

These mistakes are common but avoidable:

| Mistake | Reality |
|---|---|
| "ICC = 0.65 means 65% explained" | ICC is variance partitioning, not variance explained |
| "Slope variance is significant, so it matters" | With large N, tiny variances are significant; interpret effect size |
| "Everyone improved" (slope mean = 2) | Check the slope SD; some individuals may have negative slopes |
| "BLUPs are observed data" | BLUPs are shrunken estimates with uncertainty |
| "Marginal R² is low, so the model is bad" | Conditional R² captures individual differences |
| "Random effects are normally distributed" | This is an assumption; check Q-Q plots |
| "Intercept = baseline" | Only if time is coded 0 at baseline |
| "Use REML for all comparisons" | Use ML when fixed effects differ |
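
For the intercept-equals-baseline pitfall, recentering time is a one-line fix (a sketch; time0 is an illustrative name):

```r
library(lme4)

# Shift time so 0 is the first occasion
data_long$time0 <- data_long$time - min(data_long$time, na.rm = TRUE)
mod <- lmer(y ~ time0 + (1 + time0 | id), data = data_long)
fixef(mod)["(Intercept)"]  # now the expected value at baseline
```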

Quick Formulas

ICC (from random intercept model)

# Manual calculation
vc <- as.data.frame(VarCorr(mod_ri))
tau00 <- vc$vcov[1]  # Intercept variance
sigma2 <- vc$vcov[2]  # Residual variance
icc <- tau00 / (tau00 + sigma2)

# Or use performance package
performance::icc(mod_ri)

Intercept-Slope Correlation

# From VarCorr
vc <- VarCorr(mod)
attr(vc$id, "correlation")[1, 2]

# Manual from covariance (vc$id is the variance-covariance matrix itself)
tau01 <- vc$id[1, 2]
tau00 <- vc$id[1, 1]
tau11 <- vc$id[2, 2]
rho <- tau01 / sqrt(tau00 * tau11)

Proportion with Positive Slopes

gamma10 <- fixef(mod)["time"]
tau11 <- VarCorr(mod)$id["time", "time"]

pnorm(0, mean = gamma10, sd = sqrt(tau11), lower.tail = FALSE)

Person-Specific Trajectory

re <- ranef(mod)$id
person_i <- 5

intercept_i <- fixef(mod)["(Intercept)"] + re[person_i, "(Intercept)"]
slope_i <- fixef(mod)["time"] + re[person_i, "time"]

Predicted Value for Person i at Time t

t <- 3  # example: predict at time point 3
y_hat <- intercept_i + slope_i * t

Advanced Extensions

Time-Varying Covariates

mod <- lmer(mood ~ time + stress + (1 + time | id), data = data_long)

Centering for within/between separation:

data_long <- data_long %>%
  group_by(id) %>%
  mutate(
    stress_between = mean(stress, na.rm = TRUE),
    stress_within = stress - stress_between
  )

mod <- lmer(mood ~ time + stress_within + stress_between + (1 + time | id),
            data = data_long)

Cross-Level Interactions

# Does treatment moderate the slope?
mod <- lmer(y ~ time * treatment + (1 + time | id), data = data_long)

Quadratic Growth

mod <- lmer(y ~ time + I(time^2) + (1 + time | id), data = data_long)

Note: Random quadratic requires many observations per person.
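
If the data support it, the quadratic term can also be made random by adding I(time^2) to the random-effects term (a sketch; expect convergence trouble with few observations per person):

```r
library(lme4)

# Random intercept, linear slope, and quadratic slope per person
mod_q <- lmer(y ~ time + I(time^2) + (1 + time + I(time^2) | id),
              data = data_long)
```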

Autocorrelated Residuals (nlme)

library(nlme)

mod <- lme(y ~ time,
           random = ~ 1 + time | id,
           correlation = corAR1(form = ~ time | id),
           data = data_long)

Generalized Linear Mixed Models (GLMM)

# Binary outcome
mod <- glmer(binary_y ~ time + (1 | id), data = data_long, family = binomial)

# Count outcome
mod <- glmer(count_y ~ time + (1 | id), data = data_long, family = poisson)

Bayesian Mixed Models (brms)

library(brms)

mod <- brm(y ~ time + (1 + time | id),
           data = data_long,
           family = gaussian(),
           chains = 4, cores = 4)

Resources

Books

| Book | Focus |
|---|---|
| Singer & Willett (2003). Applied Longitudinal Data Analysis. Oxford. | Excellent pedagogy |
| Snijders & Bosker (2012). Multilevel Analysis. Sage. | Comprehensive MLM |
| Raudenbush & Bryk (2002). Hierarchical Linear Models. Sage. | Classic reference |
| Gelman & Hill (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge. | Practical, Bayesian-friendly |
| West, Welch, & Galecki (2014). Linear Mixed Models. CRC. | Software-focused |

Key Articles

| Article | Contribution |
|---|---|
| Bates et al. (2015). Fitting linear mixed-effects models using lme4. JSS. | Definitive lme4 reference |
| Barr et al. (2013). Random effects structure for confirmatory hypothesis testing. JML. | "Keep it maximal" argument |
| Matuschek et al. (2017). Balancing Type I error and power. JML. | Counter to the maximal approach |
| Luke (2017). Evaluating significance in LMMs in R. BRM. | Comparison of p-value methods |

Online

  • lme4 vignette: vignette("lmer", package = "lme4")
  • UCLA IDRE: https://stats.oarc.ucla.edu/r/seminars/
  • Bodo Winter's tutorials: https://bodowinter.com/tutorials.html
  • Michael Clark's guide: https://m-clark.github.io/mixed-models-with-R/

Software

| Package | Use Case |
|---|---|
| lme4 | Primary choice; fast, reliable, standard |
| lmerTest | p-values for lme4; load before fitting |
| nlme | Correlation structures; different syntax |
| performance | R², ICC, diagnostics; companion to lme4 |
| brms | Bayesian inference; requires Stan |
| glmmTMB | Complex GLMMs; zero-inflation support |
| Stata (mixed) | Good documentation |
| SAS (PROC MIXED) | Established, powerful |

Tutorials

| Tutorial | Focus |
|---|---|
| LMM Random Intercept | Basic random intercept model |
| LMM Random Slopes | Adding random slopes |
| LMM Time-Invariant Covariates | Between-person predictors |
| LMM Time-Varying Covariates | Within-person predictors |