LMM Quick Reference
Fast lookup for syntax, diagnostics, and troubleshooting. For step-by-step learning, see the Walkthrough. For conceptual background, see the Overview.
Jump to: Syntax · ML vs. REML · Extract Output · Model Comparison · Diagnostics · Parameters · Errors & Fixes · Troubleshooting · Pitfalls · Formulas · Extensions · Resources
lme4 Syntax
Random Intercept Only
mod <- lmer(y ~ time + (1 | id), data = data_long)
Random Intercept and Slope (Correlated)
mod <- lmer(y ~ time + (1 + time | id), data = data_long)
Random Intercept and Slope (Uncorrelated)
mod <- lmer(y ~ time + (1 | id) + (0 + time | id), data = data_long)
With Between-Person Predictor
mod <- lmer(y ~ time + treatment + (1 + time | id), data = data_long)
Cross-Level Interaction
mod <- lmer(y ~ time * treatment + (1 + time | id), data = data_long)
Quadratic Time
mod <- lmer(y ~ time + I(time^2) + (1 + time | id), data = data_long)
Random Slope Only (No Random Intercept)
mod <- lmer(y ~ time + (0 + time | id), data = data_long)
Nested Random Effects (Three-Level Data)
Use when you have clustering at multiple levels (e.g., observations within students within schools):
# Three-level: observations nested in students nested in schools
mod <- lmer(y ~ time + (1 + time | school/id), data = data_long)
# Equivalent explicit syntax
mod <- lmer(y ~ time + (1 + time | school) + (1 + time | school:id), data = data_long)
Note: For standard longitudinal data (observations within persons only), use the simpler (1 + time | id) syntax shown above.
Estimation: ML vs. REML
| Method | Use When | Code |
|---|---|---|
| REML (default) | Final model, reporting variances | REML = TRUE |
| ML | Comparing models with different fixed effects | REML = FALSE |
REML (Default)
mod <- lmer(y ~ time + (1 + time | id), data = data_long)
# or explicitly:
mod <- lmer(y ~ time + (1 + time | id), data = data_long, REML = TRUE)
ML (For Model Comparison)
mod <- lmer(y ~ time + (1 + time | id), data = data_long, REML = FALSE)
Workflow
- Compare models with REML = FALSE
- Select the best model via anova() or AIC/BIC
- Refit the final model with REML = TRUE for reporting
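Put together, the workflow above looks like this (a sketch reusing the y, time, treatment, and id variables from the other examples in this reference):

```r
library(lme4)

# 1. Fit candidate models with ML so fixed effects are comparable
mod1 <- lmer(y ~ time + (1 + time | id), data = data_long, REML = FALSE)
mod2 <- lmer(y ~ time + treatment + (1 + time | id), data = data_long, REML = FALSE)

# 2. Select via likelihood ratio test or AIC/BIC
anova(mod1, mod2)

# 3. Refit the selected model with REML for reporting variance components
mod_final <- lmer(y ~ time + treatment + (1 + time | id),
                  data = data_long, REML = TRUE)
```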
Extract Output
Summary
summary(mod)
Fixed Effects
fixef(mod)
Fixed Effects with Confidence Intervals
confint(mod, parm = "beta_", method = "Wald")
# or profile likelihood (slower, more accurate):
confint(mod, parm = "beta_", method = "profile")
Variance Components
VarCorr(mod)
Random Effects (BLUPs)
ranef(mod)$id
Fitted Values
fitted(mod)
Residuals
resid(mod)
R² (Marginal and Conditional)
library(performance)
r2(mod)
- Marginal R²: Variance explained by fixed effects
- Conditional R²: Variance explained by fixed + random effects
Intraclass Correlation (ICC)
library(performance)
icc(mod)
Coefficients Table with p-values
library(lmerTest)
mod <- lmer(y ~ time + (1 + time | id), data = data_long)
summary(mod) # Now includes p-values
Model Comparison
Likelihood Ratio Test (Nested Models)
# Must use ML for comparing different fixed effects
mod1 <- lmer(y ~ time + (1 + time | id), data = data_long, REML = FALSE)
mod2 <- lmer(y ~ time + treatment + (1 + time | id), data = data_long, REML = FALSE)
anova(mod1, mod2)
Comparing Random Effects Structures
# Can use REML when fixed effects are identical
mod_ri <- lmer(y ~ time + (1 | id), data = data_long)
mod_rs <- lmer(y ~ time + (1 + time | id), data = data_long)
anova(mod_ri, mod_rs)
Information Criteria
AIC(mod1, mod2) # Lower = better
BIC(mod1, mod2) # Lower = better; more conservative
Boundary Test Note
When testing whether a variance = 0 (e.g., random slope variance), the LRT is a boundary test. The reported p-value is conservative—the true p-value is approximately half.
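A simple correction is to halve the reported p-value. A minimal sketch, using a hypothetical chi-square statistic (7.3 on 2 df) in place of the one anova() would report:

```r
# Hypothetical LRT statistic and df from anova(mod_ri, mod_rs)
lrt_stat <- 7.3
df <- 2  # slope variance + intercept-slope covariance

p_naive <- pchisq(lrt_stat, df = df, lower.tail = FALSE)  # conservative
p_corrected <- p_naive / 2  # approximate boundary correction
```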
Diagnostics
Residuals vs. Fitted
plot(mod)
Look for: Random scatter around 0. Problematic: Funnel shape, curves.
Q-Q Plot for Residuals
qqnorm(resid(mod))
qqline(resid(mod))
Look for: Points on line. Problematic: Heavy tails, skewness.
Q-Q Plot for Random Effects
# Random intercepts
qqnorm(ranef(mod)$id[, "(Intercept)"])
qqline(ranef(mod)$id[, "(Intercept)"])
# Random slopes
qqnorm(ranef(mod)$id[, "time"])
qqline(ranef(mod)$id[, "time"])
Check for Singular Fit
isSingular(mod) # TRUE = problem
Inspect Variance Components
VarCorr(mod)
# Check for variances at or near zero
Model Performance Summary
library(performance)
check_model(mod) # Comprehensive diagnostic plots
Parameters
| Parameter | Symbol | lme4 Location |
|---|---|---|
| Average intercept | γ₀₀ | Fixed effects: (Intercept) |
| Average slope | γ₁₀ | Fixed effects: time |
| Intercept variance | τ₀₀ | Random effects: id (Intercept) |
| Slope variance | τ₁₁ | Random effects: id time |
| I-S correlation | ρ | Random effects: Corr |
| Residual variance | σ² | Residual |
Reading lme4 Output
Random effects:
Groups Name Variance Std.Dev. Corr
id (Intercept) 98.54 9.93 <- τ₀₀
time 0.89 0.94 -0.18 <- τ₁₁, ρ
Residual 24.89 4.99 <- σ²
Fixed effects:
Estimate Std. Error ...
(Intercept) 49.923 0.732 <- γ₀₀
time 2.021 0.085 <- γ₁₀
Common Errors & Fixes
| Issue | Symptom | Fix |
|---|---|---|
| Singular fit | boundary (singular) fit | Simplify random effects; check VarCorr() |
| Time as factor | 4 dummy coefficients, not 1 slope | time <- as.numeric(as.character(time)) |
| REML comparison | Invalid LRT for fixed effects | Use REML = FALSE |
| Convergence failure | "Model failed to converge" | Simplify model; scale predictors; increase iterations |
| Missing p-values | No p-values in summary | Load lmerTest before fitting |
| Data not long format | Error about missing variables | Reshape with pivot_longer() |
Troubleshooting
Singular Fit
Warning: boundary (singular) fit: see help('isSingular')
What it means: A variance component is estimated at or very near zero.
Diagnose:
VarCorr(mod) # Check which variance is near zero
Solutions:
- Remove the problematic random effect
- Use uncorrelated random effects: (1 | id) + (0 + time | id)
- Accept if theoretically justified (variance genuinely near zero)
Convergence Failure
Warning: "Model failed to converge" or "convergence code: -1"
Solutions:
# Increase iterations
mod <- lmer(y ~ time + (1 + time | id), data = data_long,
control = lmerControl(optCtrl = list(maxfun = 20000)))
# Try different optimizer
mod <- lmer(y ~ time + (1 + time | id), data = data_long,
control = lmerControl(optimizer = "bobyqa"))
# Scale predictors
data_long$time_scaled <- as.numeric(scale(data_long$time))  # as.numeric() drops the matrix attributes scale() adds
BLUPs in Secondary Analyses
Problem: Using ranef() output as data in follow-up analyses.
Why it's wrong: BLUPs have uncertainty; treating them as fixed underestimates SEs.
Solution: Include predictors in the original model:
# Wrong
re <- ranef(mod)$id
cor.test(re$`(Intercept)`, external_variable)
# Right
mod <- lmer(y ~ time + external_variable + (1 + time | id), data = data_long)
No p-values in Output
Solution: Load lmerTest before fitting:
library(lmerTest)
mod <- lmer(y ~ time + (1 + time | id), data = data_long)
summary(mod) # Now includes p-values
Negative Variance Estimate
Unlike lavaan, lme4 constrains variances to be non-negative. If you see a variance at exactly zero, it's hitting the boundary—see Singular Fit above.
Interpretation Pitfalls
| Mistake | Reality |
|---|---|
| "ICC = 0.65 means 65% explained" | ICC is variance partitioning, not variance explained |
| "Slope variance is significant, so it matters" | With large N, tiny variances are significant; interpret effect size |
| "Everyone improved" (slope mean = 2) | Check slope SD—some individuals may have negative slopes |
| "BLUPs are observed data" | BLUPs are shrunken estimates with uncertainty |
| "Marginal R² is low, model is bad" | Conditional R² captures individual differences |
| "Random effects are normally distributed" | This is an assumption—check Q-Q plots |
| "Intercept = baseline" | Only if time is coded 0 at baseline |
| "Use REML for all comparisons" | Use ML when fixed effects differ |
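For the "Intercept = baseline" pitfall, recoding time so that 0 falls at the first measurement occasion makes the intercept directly interpretable. A minimal sketch, assuming time holds wave numbers starting at 1:

```r
# Shift time so the first occasion is 0; the intercept is then the expected baseline
data_long$time0 <- data_long$time - min(data_long$time, na.rm = TRUE)
mod <- lmer(y ~ time0 + (1 + time0 | id), data = data_long)
```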
Quick Formulas
ICC (from random intercept model)
# Manual calculation
vc <- as.data.frame(VarCorr(mod_ri))
tau00 <- vc$vcov[1] # Intercept variance
sigma2 <- vc$vcov[2] # Residual variance
icc <- tau00 / (tau00 + sigma2)
# Or use performance package
performance::icc(mod_ri)
Intercept-Slope Correlation
# From VarCorr
vc <- VarCorr(mod)
attr(vc$id, "correlation")[1, 2]
# Manual from covariance (vc$id is itself the covariance matrix)
tau01 <- vc$id[1, 2]
tau00 <- vc$id[1, 1]
tau11 <- vc$id[2, 2]
rho <- tau01 / sqrt(tau00 * tau11)
Proportion with Positive Slopes
gamma10 <- fixef(mod)["time"]
tau11 <- VarCorr(mod)$id["time", "time"]
pnorm(0, mean = gamma10, sd = sqrt(tau11), lower.tail = FALSE)
Person-Specific Trajectory
re <- ranef(mod)$id
person_i <- 5
intercept_i <- fixef(mod)["(Intercept)"] + re[person_i, "(Intercept)"]
slope_i <- fixef(mod)["time"] + re[person_i, "time"]
Predicted Value for Person i at Time t
y_hat <- intercept_i + slope_i * t  # t = time point of interest
Advanced Extensions
Time-Varying Covariates
mod <- lmer(mood ~ time + stress + (1 + time | id), data = data_long)
Centering for within/between separation:
library(dplyr)
data_long <- data_long %>%
  group_by(id) %>%
  mutate(
    stress_between = mean(stress, na.rm = TRUE),
    stress_within = stress - stress_between
  ) %>%
  ungroup()
mod <- lmer(mood ~ time + stress_within + stress_between + (1 + time | id),
data = data_long)
Cross-Level Interactions
# Does treatment moderate the slope?
mod <- lmer(y ~ time * treatment + (1 + time | id), data = data_long)
Quadratic Growth
mod <- lmer(y ~ time + I(time^2) + (1 + time | id), data = data_long)
Note: Random quadratic requires many observations per person.
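If the data can support it, a random quadratic term would be specified as below (a sketch; with sparse data expect singular fits or convergence warnings):

```r
mod <- lmer(y ~ time + I(time^2) + (1 + time + I(time^2) | id),
            data = data_long)
```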
Autocorrelated Residuals (nlme)
library(nlme)
mod <- lme(y ~ time,
random = ~ 1 + time | id,
correlation = corAR1(form = ~ time | id),
data = data_long)
Generalized Linear Mixed Models (GLMM)
# Binary outcome
mod <- glmer(binary_y ~ time + (1 | id), data = data_long, family = binomial)
# Count outcome
mod <- glmer(count_y ~ time + (1 | id), data = data_long, family = poisson)
Bayesian Mixed Models (brms)
library(brms)
mod <- brm(y ~ time + (1 + time | id),
data = data_long,
family = gaussian(),
chains = 4, cores = 4)
Resources
Books
| Book | Focus |
|---|---|
| Singer & Willett (2003). Applied Longitudinal Data Analysis. Oxford. | Excellent pedagogy |
| Snijders & Bosker (2012). Multilevel Analysis. Sage. | Comprehensive MLM |
| Raudenbush & Bryk (2002). Hierarchical Linear Models. Sage. | Classic reference |
| Gelman & Hill (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge. | Practical, Bayesian-friendly |
| West, Welch, & Galecki (2014). Linear Mixed Models. CRC. | Software-focused |
Key Articles
| Article | Contribution |
|---|---|
| Bates et al. (2015). Fitting linear mixed-effects models using lme4. JSS. | Definitive lme4 reference |
| Barr et al. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. JML. | "Keep it maximal" argument |
| Matuschek et al. (2017). Balancing Type I error and power. JML. | Counter to maximal approach |
| Luke (2017). Evaluating significance in LMMs in R. BRM. | p-value methods comparison |
Online
- lme4 vignette: vignette("lmer", package = "lme4")
- UCLA OARC (formerly IDRE): https://stats.oarc.ucla.edu/r/seminars/
- Bodo Winter's tutorials: https://bodowinter.com/tutorials.html
- Michael Clark's guide: https://m-clark.github.io/mixed-models-with-R/
Software
| Package | Use Case |
|---|---|
| lme4 | Primary choice; fast, reliable, standard |
| lmerTest | p-values for lme4; load before fitting |
| nlme | Correlation structures; different syntax |
| performance | R², ICC, diagnostics; companion to lme4 |
| brms | Bayesian inference; requires Stan |
| glmmTMB | Complex GLMMs; zero-inflation support |
| Stata (mixed) | Good documentation |
| SAS (PROC MIXED) | Established, powerful |
Related Tutorials
| Tutorial | Focus |
|---|---|
| LMM Random Intercept | Basic random intercept model |
| LMM Random Slopes | Adding random slopes |
| LMM Time-Invariant Covariates | Between-person predictors |
| LMM Time-Varying Covariates | Within-person predictors |
Related Guides
- LMM Overview — When to use, key concepts
- LMM Walkthrough — Step-by-step worked example
- LGCM Overview — SEM-based growth curves (equivalent for basic models)
- GLMM Overview — Generalized mixed models for non-continuous outcomes
- GEE Overview — Population-averaged alternative to mixed models