When Your Focus Is the Population, Not the Person

Some longitudinal questions are about population-level patterns:

Does this intervention reduce the average prevalence of substance use?
Is the population rate of emergency visits declining over time?
Across all participants, does the average count differ by treatment group?

These are marginal questions — they ask what happens on average across the population. Answering them requires accounting for repeated measures on the same person, but it doesn't require modeling each person's individual trajectory.

Generalized Estimating Equations (GEE) are designed for exactly this. GEE takes a semi-parametric approach:

Specify how the mean relates to predictors (the regression model)
Specify a working correlation structure to approximate within-person dependence
Use a sandwich estimator to produce valid standard errors even if the working correlation is wrong

The result: consistent, efficient estimates of population-averaged effects with robust inference and minimal distributional assumptions.

Population-Averaged Trends — Faded lines are individual probability trajectories — what a mixed model would estimate. Bold curves are population-averaged trends — what GEE estimates. GEE answers "what happens on average?" without modeling person-specific variation.

Tip

Before You Continue

GEE and GLMM answer fundamentally different questions — neither is "better." If you need individual trajectories, random effect variances, or subject-specific predictions, see GLMM. If you need population-averaged effects with minimal distributional assumptions, read on.

What GEE Provides

Population-Averaged (Marginal) Effects

GEE estimates what happens to the average person in the population — not a specific individual. For binary outcomes, the coefficient describes how the population prevalence changes with a predictor. For counts, it describes how the population rate changes.

This distinction matters most for non-linear models. In a logistic GLMM, the odds ratio is conditional — it applies to a specific person holding their random effect constant. In GEE, the odds ratio is marginal — it describes the population-level association. The marginal OR is always closer to 1.0 than the conditional OR because averaging over heterogeneous individuals attenuates the effect.

Robust Standard Errors

GEE's signature feature is the sandwich (Huber-White) estimator. It produces valid standard errors and confidence intervals even when the working correlation structure is misspecified. This is a remarkable property: you can get the correlation structure wrong and still get valid inference for the regression coefficients.

The trade-off: if the working correlation is close to the truth, robust SEs are slightly less efficient than model-based (naive) SEs. If it's far from the truth, robust SEs protect you while naive SEs don't.

Minimal Distributional Assumptions

GEE doesn't require a full likelihood specification. You need:

A correct mean model (how predictors relate to the outcome)
A variance function (how variance relates to the mean)
A working correlation structure (can be wrong — robust SEs cover you)

You do not need:

A correctly specified random effects distribution
The true correlation structure
Normality of anything

This makes GEE particularly attractive when you distrust distributional assumptions or when the random effects distribution is genuinely non-normal.

Straightforward Interpretation

GEE coefficients have the same interpretation as coefficients from a standard GLM fit to independent data — just with appropriate standard errors for the correlation. No random effects to condition on, no link-scale complications for marginal summaries.

Handles Unbalanced Data Gracefully

Like LMM and GLMM, GEE naturally handles unbalanced designs — participants can have different numbers of observations, measurement timing can vary, and some waves can be missing entirely. The estimating equations use all available observations from each person.

Flexible Time Structures

Time can be equally spaced (0, 1, 2, 3, 4), unequally spaced (0, 3, 6, 12, 24 months), or person-specific (actual measurement dates). Code time as a continuous variable with appropriate values — the estimating equation machinery works the same regardless of spacing.

Missing Data: Important Limitation

Unlike LMM and GLMM (which handle MAR), standard GEE assumes Missing Completely At Random (MCAR) — the strongest and least realistic missing data assumption. If dropout depends on observed variables, GEE estimates may be biased without correction. See Practical Considerations for mitigations including weighted GEE and multiple imputation.

When GEE is Appropriate

Requirement	Guideline
Repeated measures	3+ waves (GEE works with 2, but limited)
Interest in population averages	Not individual trajectories
Adequate clusters	40+ individuals minimum; 100+ recommended for robust SEs
Outcome type	Binary, count, ordinal
Missing data mechanism	MCAR or approximately MCAR (see below)

Key Components

GEE has two distinctive building blocks beyond the mean model itself. Instead of specifying a full probability model (as GLMM does), GEE finds parameter values that make weighted residuals sum to zero across all participants — where the weights come from a working correlation structure and inference is protected by the sandwich estimator. See Mathematical Foundations for formal notation.

Working Correlation Structures

The working correlation matrix R(α) specifies how observations within a person are related. GEE offers several options:

Structure	Pattern	Assumption	Parameters
Independence	All off-diagonals = 0	No within-person correlation	0
Exchangeable	All pairs equally correlated	Compound symmetry	1 (α)
AR(1)	Correlation decays with lag	Autoregressive	1 (α)
Unstructured	Each pair has its own correlation	No pattern assumed	T(T−1)/2

Working Correlation Structure Comparison — The four standard working correlation structures for 5 waves with α = 0.45. Independence assumes no correlation; exchangeable assumes all pairs are equally correlated; AR(1) shows exponential decay with lag; unstructured allows each pair its own value.

Note

The working correlation doesn't have to be right

With robust SEs, the choice of working correlation affects efficiency (precision) but not consistency (validity). A poorly specified working correlation gives wider confidence intervals than necessary, but they still have correct coverage. A close-to-correct working correlation gives tighter intervals.

The Sandwich Estimator

GEE's robustness comes from how it estimates the covariance of β̂:

Var(β̂) = B⁻¹ M B⁻¹

Where:

B = model-based (naive) information matrix — what you'd get if the working correlation were correct
M = "meat" — empirical correction based on observed residuals

The "bread-meat-bread" sandwich structure means:

If the working correlation is correct: B⁻¹ M B⁻¹ ≈ B⁻¹ (sandwich ≈ naive)
If the working correlation is wrong: B⁻¹ alone is invalid, but the sandwich correction fixes it

Small-sample adjustment: With fewer than 40–50 clusters, the sandwich estimator can underestimate standard errors. Use bias-corrected variants (e.g., sandwich = "MD" or sandwich = "KC" in some packages).

Mathematical Foundations (optional formal notation)

The GEE Framework

For person i with nᵢ observations, the marginal mean is:

g(μᵢₜ) = x'ᵢₜβ

Where g(·) is the link function (same choices as GLMM: logit, log, identity).

The marginal variance is:

Var(yᵢₜ) = φ · v(μᵢₜ)

Where v(·) is the variance function and φ is the dispersion parameter.

Estimating Equations

The GEE estimator β̂ solves:

U(β) = Σᵢ₌₁ᴺ D'ᵢ Vᵢ⁻¹ Sᵢ = 0

Where:

Sᵢ = yᵢ − μᵢ(β) is the vector of residuals for person i
Dᵢ = ∂μᵢ/∂β is the matrix of derivatives
Vᵢ = φ Aᵢ^(1/2) R(α) Aᵢ^(1/2) is the working covariance
Aᵢ = diag{v(μᵢ₁), …, v(μᵢₙᵢ)} contains variance function values
R(α) is the working correlation matrix

Consistency

Under correct specification of the mean model E(yᵢₜ) = g⁻¹(x'ᵢₜβ), β̂_GEE is consistent for β regardless of the working correlation R(α).

Asymptotic Variance

√N(β̂ − β) → N(0, B⁻¹ M B⁻¹)

Where:

B = Σᵢ D'ᵢ Vᵢ⁻¹ Dᵢ
M = Σᵢ D'ᵢ Vᵢ⁻¹ Sᵢ S'ᵢ Vᵢ⁻¹ Dᵢ

This is the sandwich variance — valid under misspecification of R(α).

QIC (Quasi-likelihood Information Criterion)

QIC = -2Q(β̂ᵣ; I) + 2 trace(Ω̂ᵢ V̂ᵣ)

Where Q is the quasi-likelihood evaluated under independence working correlation, and the penalty adjusts for model complexity. Lower QIC = better fit.

Interpretation

GEE coefficients are marginal (population-averaged) effects on the link scale — the same scale as a standard GLM. This is GEE's interpretive advantage: no random effects to condition on.

Binary Outcomes (Logistic GEE)

Coefficients are log-odds of the population proportion. Exponentiate for marginal odds ratios:

Parameter	Link Scale	Natural Scale
Intercept	Log-odds of population prevalence at time = 0	Prevalence = logit⁻¹(β₀)
Time coefficient	Change in log-odds per unit time	Marginal OR = exp(β₁)
Predictor coefficient	Difference in log-odds	Marginal OR = exp(β)

Example: β₁ = -0.25 for time → OR = exp(-0.25) = 0.78. Across the population, the odds of the outcome decrease by ~22% per wave.

Marginal vs. conditional: The marginal OR from GEE is always closer to 1.0 than the conditional OR from GLMM. This isn't bias — they answer different questions. The gap grows with random effect variance: more individual heterogeneity means more attenuation when averaging across individuals.

Count Outcomes (Log-link GEE)

Coefficients are on the log scale. Exponentiate for marginal incidence rate ratios (IRR):

Parameter	Link Scale	Natural Scale
Intercept	Log population rate at time = 0	Population rate = exp(β₀)
Time coefficient	Change in log rate per unit time	Marginal IRR = exp(β₁)
Predictor coefficient	Difference in log rate	Marginal IRR = exp(β)

Example: β₁ = 0.10 for time → IRR = exp(0.10) = 1.11. Across the population, the expected rate increases by ~11% per wave.

Note

Continuous outcomes? For continuous outcomes with an identity link, GEE coefficients are identical to LMM fixed effects — use LMM instead.

Interactive Exploration

To build deeper intuition for how GEE works, use the interactive explorer below. Adjust the correlation structure and parameters to see how the working correlation matrix changes, how population-averaged (GEE) curves differ from conditional (GLMM) trajectories, and when robust SEs diverge from naive SEs.

This tool lets you:

Switch correlation structures — see how Independence, Exchangeable, AR(1), and Unstructured produce different R(α) matrices for the same α
Adjust α — watch the heatmap update in real-time as within-person correlation strengthens or weakens
Compare marginal vs conditional — the bold red GEE curve shows the population average; faded lines show individual GLMM trajectories. Increase τ² to see the curves diverge
Examine SE behavior — the bottom bar chart shows when robust SEs protect you from a misspecified working correlation (try Independence with high α)

Practical Considerations

Missing Data: The MCAR Requirement

GEE's most important limitation is its assumption about missing data:

Mechanism	Definition	GEE Valid?
MCAR	Missingness is completely random — unrelated to any variables	Yes
MAR	Missingness depends on observed variables	Biased without correction
MNAR	Missingness depends on unobserved values	Biased

Standard GEE assumes MCAR — the strongest and least realistic assumption. If participants who are getting worse are more likely to drop out (MAR), GEE estimates will be biased.

Mitigations for MAR dropout:

Weighted GEE (WGEE): weight each observation by the inverse probability of being observed
Multiple imputation + GEE: impute missing values, fit GEE to each imputed dataset, pool results
Use GLMM instead — FIML/REML handles MAR naturally

Number of Clusters

The sandwich estimator is an asymptotic result — it works well with many clusters. With few clusters:

Clusters	Recommendation
< 20	Avoid GEE or use bias-corrected SEs with extreme caution
20–40	Use bias-corrected sandwich (Mancl-DeRouen or Kauermann-Carroll)
40–100	Standard sandwich is acceptable; bias-corrected is preferable
100+	Standard sandwich works well

Choosing a Correlation Structure

In practice, the choice rarely matters much for point estimates (they're consistent regardless). It affects efficiency:

Start with exchangeable — a safe default for most longitudinal data
Compare with AR(1) if you expect temporal decay
Use QIC (quasi-likelihood information criterion) to compare structures
Check that robust and naive SEs are reasonably close — large discrepancies suggest the working correlation is far from reality

library(geepack)

fit_exch  <- geeglm(y ~ time + group, id = id, data = df,
                     family = binomial, corstr = "exchangeable")
fit_ar1   <- geeglm(y ~ time + group, id = id, data = df,
                     family = binomial, corstr = "ar1")
fit_indep <- geeglm(y ~ time + group, id = id, data = df,
                     family = binomial, corstr = "independence")

Common Pitfalls

Pitfall

These mistakes are common but avoidable:

Pitfall	Mistake	Fix
Treating GEE coefficients as conditional	Interpreting GEE OR as applying to a specific person	GEE gives marginal (population-averaged) effects; marginal OR is always closer to 1.0 than conditional OR
Ignoring MCAR assumption with dropout	Using standard GEE when dropout is related to the outcome	If dropout is MAR, GEE is biased; use weighted GEE, multiple imputation, or GLMM with FIML
Using too few clusters	Fitting GEE with < 40 participants and trusting sandwich SEs	Sandwich estimator needs large N (clusters, not observations); use bias-corrected variants or GLMM
Over-interpreting working correlation	Reporting estimated α as the "true" within-person correlation	Working correlation is a nuisance parameter for efficiency, not a reliable correlation estimate
Comparing GEE and GLMM coefficients directly	Concluding one is wrong because ORs differ	They estimate different quantities; marginal OR < conditional OR by design
Using naive SEs instead of robust	Reporting model-based SEs without sandwich correction	Naive SEs require exactly correct working correlation; always report robust SEs
Unstructured correlation with many time points	Using `corstr = "unstructured"` with 10+ waves	Requires T(T−1)/2 parameters; becomes unstable with limited clusters; use exchangeable or AR(1)
Not checking robust vs. naive SE agreement	Never comparing the two SE sets	Large discrepancies (> 20%) signal misspecified working correlation; useful diagnostic
GEE for clustered data without thought	Using GEE for any clustering without considering cluster-level effects	GEE averages over clusters; if cluster variation matters substantively, use multilevel models
QIC as likelihood-based criterion	Using QIC as if it were AIC with the same properties	QIC is quasi-likelihood-based; useful for relative comparison within GEE, but not as theoretically grounded

Summary

GEE provides population-averaged estimates for correlated data with minimal distributional assumptions:

Marginal effects describe population averages — how the outcome changes across the population, not for a specific individual
Working correlation structures approximate within-person dependence; the choice affects efficiency but not consistency
Sandwich standard errors produce valid inference even when the working correlation is wrong — GEE's defining feature
MCAR assumption is GEE's main limitation — informative dropout biases estimates without correction
Fewer assumptions than GLMM — no random effects distribution required — but also fewer capabilities (no individual predictions, no variance components)

The choice between GEE and GLMM isn't about which is "better" — it's about which question you're asking. Use GEE when the population average is the quantity of interest and you want robust, assumption-light inference.

Info

Scope

This overview covers standard GEE for binary and count outcomes with a single clustering variable (persons). Not covered: weighted GEE for MAR dropout, ordinal GEE (see Reference for ordgee syntax), multinomial GEE (multgee), penalized GEE, doubly robust estimators, and alternating logistic regressions.

Next Steps

Walkthrough: Worked Example → Step-by-step R code to fit GEE models, compare correlation structures, and examine robust SEs.