Effective Health Care Program

Change Score or Followup Score? An Empirical Evaluation of the Impact of Choice of Mean Difference Estimates

White Paper


Structured Abstract


In randomized controlled clinical trials, continuous outcomes are typically measured at both baseline and followup, and the mean difference between groups is analyzed as the effect measure. There are multiple ways to estimate the mean difference: using the change scores from baseline, using the followup scores, or using an analysis of covariance (ANCOVA) model. The ANCOVA model is generally preferable to either alternative: when baseline scores are imbalanced, using either the change scores or the followup scores produces biased estimates of the mean difference, while the ANCOVA model provides the least biased estimates. Nonetheless, individual studies often report results incompletely, and investigators have to summarize results across studies that are not optimally reported. The impact of using the change score versus the followup score on meta-analysis has not been well studied.
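The difference between the three estimators can be illustrated on simulated trial data. The sketch below (Python, with hypothetical parameters chosen only for illustration) shows how baseline imbalance biases the change-score and followup-score estimates in opposite directions, while ANCOVA, which regresses the followup score on the treatment indicator and the baseline score, recovers an estimate close to the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a two-arm trial with imbalanced baseline scores
# (all parameters are hypothetical, chosen for illustration).
n = 200
true_effect = -2.0                      # true treatment effect on followup score
base_t = rng.normal(52, 10, n)          # treatment arm starts higher at baseline
base_c = rng.normal(48, 10, n)
follow_t = 0.7 * base_t + true_effect + rng.normal(0, 5, n)
follow_c = 0.7 * base_c + rng.normal(0, 5, n)

# 1) Followup-score estimate: difference in followup means
md_followup = follow_t.mean() - follow_c.mean()

# 2) Change-score estimate: difference in mean change from baseline
md_change = (follow_t - base_t).mean() - (follow_c - base_c).mean()

# 3) ANCOVA estimate: regress followup score on treatment indicator and baseline
y = np.concatenate([follow_t, follow_c])
treat = np.concatenate([np.ones(n), np.zeros(n)])
base = np.concatenate([base_t, base_c])
X = np.column_stack([np.ones(2 * n), treat, base])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
md_ancova = beta[1]                     # coefficient on the treatment indicator

print(md_followup, md_change, md_ancova)
```

Because the followup score ignores the baseline imbalance entirely and the change score over-corrects for it (the baseline-followup correlation is below 1), the two simple estimates land on opposite sides of the ANCOVA estimate in this setup.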


We selected six comparative effectiveness reviews published by the Agency for Healthcare Research and Quality that included at least one meta-analysis of a continuous outcome using mean difference. Data were abstracted from a total of 63 meta-analyses (156 trials) to evaluate differences in baseline scores and how the choice between the change score and the followup score affected the combined mean difference, estimated with a random-effects model, and whether it produced discrepancies in conclusions. A discrepancy in conclusions occurs when one estimate (e.g., the change score) shows a significant difference and the other (e.g., the followup score) does not. We also evaluated whether the impact varied qualitatively by the comparator and by alternative random-effects estimators.


Based on the DerSimonian-Laird (DL) method, using the change score versus the followup score led to 5 of the 63 meta-analyses (7.9%) showing a discrepancy in conclusions; based on the profile likelihood (PL) method, 9 (14.3%) showed a discrepancy. Using the change score was more likely to show a significant difference in effects between interventions (4 of 5 using the DL method, and 7 of 9 using the PL method). The impact of the change score versus the followup score using the maximum likelihood method was similar to that of the DL method, and the impact using the restricted maximum likelihood method was similar to that of the PL method. Using the Knapp-Hartung modification of the random-effects estimate led to the most (10) meta-analyses showing a discrepancy in conclusions. A significant difference in baseline scores did not necessarily lead to a discrepancy in conclusions. Finally, among the 10 meta-analyses that compared an active intervention versus control or usual care, using the change score versus the followup score led to one discrepancy in conclusions under either the DL or the PL method (10%), but using change scores consistently produced larger intervention effects in nine meta-analyses. Among the other 53 meta-analyses comparing different interventions, there were 4 discrepancies using the DL method (7.5%) and 8 using the PL method (15.1%).
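As a concrete illustration of two of the pooling methods compared above, the following sketch (Python, with hypothetical per-trial values rather than data from this report) implements DerSimonian-Laird random-effects pooling and contrasts the standard Wald confidence interval with the Knapp-Hartung interval, which is typically wider and therefore less likely to declare significance.

```python
import numpy as np
from scipy import stats

# Hypothetical per-trial mean differences and standard errors
# (illustrative only, not values abstracted in the report).
md = np.array([-3.1, -1.2, -2.5, 0.4])
se = np.array([0.9, 0.7, 1.1, 0.8])
k = len(md)

# Fixed-effect weights and Cochran's Q heterogeneity statistic
w = 1.0 / se**2
mu_fe = np.sum(w * md) / np.sum(w)
Q = np.sum(w * (md - mu_fe) ** 2)

# DerSimonian-Laird method-of-moments estimate of between-trial variance
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (Q - (k - 1)) / c)

# Random-effects pooled mean difference and its standard error
w_re = 1.0 / (se**2 + tau2)
mu = np.sum(w_re * md) / np.sum(w_re)
se_mu = np.sqrt(1.0 / np.sum(w_re))
ci_dl = (mu - 1.96 * se_mu, mu + 1.96 * se_mu)   # standard Wald 95% CI

# Knapp-Hartung modification: rescale the variance by a weighted
# residual statistic and use a t(k-1) critical value.
q_kh = np.sum(w_re * (md - mu) ** 2) / (k - 1)
se_kh = np.sqrt(q_kh) * se_mu
t_crit = stats.t.ppf(0.975, k - 1)
ci_kh = (mu - t_crit * se_kh, mu + t_crit * se_kh)
```

With heterogeneous trial results such as these, the Knapp-Hartung interval is wider than the Wald interval, which is consistent with that modification producing the largest number of discrepancies in conclusions.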


This study of 63 meta-analyses indicated that using the change score versus the followup score to estimate the mean difference can lead to important discrepancies in conclusions. When conclusions diverge, the change score is more likely to produce significant results, while the followup score tends to produce more conservative results. Sensitivity analyses using both change scores and final values should be conducted to check the robustness of results to the choice of mean difference estimates.

Journal Publications

Fu R, Holmer HK. Change score or follow-up score? Choice of mean difference estimates could impact meta-analysis conclusions. J Clin Epidemiol. 2016 Feb 27 [Epub ahead of print]. DOI: 10.1016/j.jclinepi.2016.01.034. PMID: 26931293.